Channel: Statalist

Fixed effects in RD plots

Hi, I am trying to generate RD plots with and without year FEs, but the graph doesn't look smoothed after the addition of FEs.

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int year float(running_var log_y uid)
2014  89.366  8.973331  1
2016  89.366  9.188673  1
2014  37.102  8.661576  2
2016  37.102  8.221179  2
2014  14.554  7.152594  3
2016  14.554  6.596597  3
2014   22.62  7.859138  4
2016   22.62  7.603361  4
2014  23.115  7.621384  5
2016  23.115  7.563481  5
2014  15.393  7.801198  6
2016  15.393  7.044147  6
2014   15.63  7.341237  7
2016   15.63  7.464042  7
2016  11.449  6.591065  8
2014  29.235  8.176641  9
2016  29.235   8.03603  9
2014  16.534  7.577722 10
2016  16.534   7.73528 10
2014  16.996  7.900858 11
2014 110.059  8.804719 12
2016  59.848  8.576594 13
2014   16.12  7.959042 14
2016   16.12  7.199755 14
2014 114.337  8.703652 15
2014  14.512  6.587711 16
2014  28.146  7.484158 17
2016  28.146  7.606704 17
2014   15.54  7.448243 18
2016   15.54  7.511616 18
2014 100.203  8.734136 19
2014  67.393  8.499454 20
2016  67.393  8.581483 20
2014  25.546  8.181071 21
2014  115.07  8.316578 22
2016  115.07    8.1924 22
2014  20.347  7.261263 23
2016  20.347  7.078819 23
2014  32.531   7.30103 24
2016  32.531  7.018701 24
2016  36.631  7.815246 25
2016  78.586  7.913814 26
2014  51.005 8.5136175 27
2016  51.005  8.789665 27
2014  46.133  7.859138 28
2016  46.133  7.309843 28
2014   19.01  9.022535 29
2016   19.01  8.146841 29
2014  73.146    8.1924 30
2016  73.146 8.6155815 30
2014   22.46  7.180985 31
2016   22.46  6.705008 31
2014  35.318  8.617797 32
2016  35.318  8.604086 32
2014  11.704  7.692583 33
2016  11.704  8.095413 33
2014  11.989  7.341632 34
2016  11.989  6.267172 34
2016  44.108  7.347915 35
2014  27.292  8.050651 36
2016  27.292  7.631241 36
2014 137.474  7.270213 37
2016 137.474  8.610766 37
2014  15.763  7.684396 38
2016  15.763  7.104487 38
2014  13.761  7.735838 39
2016  13.761  7.764027 39
2014  20.421  7.312177 40
2016  20.421   7.75151 40
2014 161.595  8.942698 41
2016 161.595  8.415491 41
2014  11.393  7.905742 42
2016  11.393  7.720407 42
2014  11.501  7.876102 43
2014  14.982   6.91169 44
2016  14.982  7.002598 44
2014  24.247  8.088526 45
2016  24.247  8.295765 45
2014  23.309  8.386142 46
2016  23.309 8.5717325 46
2014  10.428  9.830735 47
2016  10.428  8.745949 47
2014 129.653  8.197004 48
2016 129.653  8.308928 48
2014  14.022  6.466868 49
2016 108.738  7.598353 50
2014  72.212  7.382017 51
2016  17.383  6.932474 52
2014  25.646  7.209515 53
2016  25.646  6.599883 53
2014  18.232  7.209783 54
2016  18.232  7.653213 54
2016   11.03  7.223755 55
2014  28.753  8.414756 56
2016  28.753  8.177854 56
2014  33.928  8.362784 57
2016  33.928  8.072066 57
2014  39.279  8.005952 58
2016  39.279    8.2108 58
2014  90.547  7.536053 59
end
Code:
preserve
xi i.year
reg log_y c.running_var##c.running_var _Iyear* if running_var < 100, vce(cluster uid)
        predict yhat_left if running_var < 100, xb
        predict stdp_left if running_var < 100, stdp
       
        reg log_y c.running_var##c.running_var _Iyear* if running_var > 100 , vce(cluster uid)
        predict yhat_right if running_var > 100, xb
        predict stdp_right if running_var > 100, stdp

foreach side in left right {
        gen cipos_`side' = yhat_`side' + (1.96 * stdp_`side')
        gen cineg_`side' = yhat_`side' - (1.96 * stdp_`side')
    }

    keep running_var *_left *_right
    tempfile fit
    save `fit'

    * bin data
    restore, preserve
    
    qui: summ running_var
    egen bin = cut(running_var), at(0(10)200)
    replace bin = bin + 5
    collapse (mean) running_var log_y, by(bin)

    * --- Make plot
    append using `fit'
    twoway ///
        (line yhat_left running_var, sort(running_var) lcolor(black) lpat(solid) lwidth(medthick)) (line cipos_left running_var, sort(running_var) lcolor(black) lpat(shortdash) lwidth(thin)) (line cineg_left running_var, sort(running_var) lcolor(black) lpat(shortdash) lwidth(thin)) ///
        (line yhat_right running_var, sort(running_var) lcolor(black) lpat(solid) lwidth(medthick)) (line cipos_right running_var, sort(running_var) lcolor(black) lpat(shortdash) lwidth(thin)) (line cineg_right running_var, sort(running_var) lcolor(black) lpat(shortdash) lwidth(thin)) ///
        (scatter log_y running_var, xline(100) mcolor(black)), ///
        legend(off) graphregion(color(white)) xtitle("`xtitle'")  ytitle("`ytitle'") xline(100, lpattern(shortdash) lc(black)) ylab(, nogrid) note("`note'")
restore
Try running the regressions with and without "_Iyear*" term, and you will see the difference. What do I do to have the year fixed effects and have a smoothed graph as well?

Thanks!

Latex with Stata

Hi

What is the command to present the results of the summarize and esttab commands in LaTeX, please?
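
One possible approach, sketched under the assumption that the community-contributed estout package (which provides esttab, estpost and eststo) is installed from SSC. The auto dataset and the output file names are placeholders chosen only for illustration.

```stata
* Assumes: ssc install estout
sysuse auto, clear

* Summary statistics: post the -summarize- results, then write them as LaTeX
estpost summarize price mpg weight
esttab using "summary.tex", cells("mean sd min max") nomtitle nonumber replace

* Regression results: store the estimates, then write them as LaTeX
eststo clear
eststo: regress price mpg weight
esttab using "regression.tex", se label replace
```

esttab picks the LaTeX format from the .tex extension, so the resulting files can be pulled into a document with \input{}.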

Graph Workflow (https://graphworkflow.com/)

My apologies for the unsolicited ad, but I believe that this announcement could be of interest to some Stata users.

Using Stata, I have developed online resources that describe what I call a Graph Workflow approach to developing data graphs. My thoughts are collected here:

https://graphworkflow.com/

The ultimate aim of this website is to guide scientific work in applying a systematic approach towards data graphing for publishing accurate, informative and efficient graphs for reporting, exploratory and inferential purposes.

All graphs in this website are created using Stata code, which I provide for exact reproduction at the end of each major page. But the purpose of the website is not to teach Stata. If you know Stata then you can take the code and apply it to your own work.

The home page of https://graphworkflow.com/ offers a random palette of graphs for browsing content by type. This unstructured learning suits those who are interested in finding out how to make one of these types of graphs. Just click on one, read the approach and download the Stata code at the end of each page to learn how to make it.

For more structured learning, you must master the so-called Graph Workflow model. Click on the link "LEARN THE WORKFLOW MODEL" shown in the top right corner of every webpage to reveal a table of contents. Try to read these pages in the order that they are provided. Start with the Graph Workflow model page. Think of this as a table of contents of an online manual for data graphing.

Please note that there is still a lot of tidying up to do, more code to release, and a handful of pages are flagged as under construction (labelled as TBD - to be developed). I planned to make this announcement in a few weeks when I was ready, but Stata has posted a link on its Facebook page today so this is out in the open.

I hope that you will find these resources useful. I welcome all feedback, and especially constructive feedback that can help me improve these learning resources even more.

Regards,
Demetris Christodoulou

MAC OSX directory

Hello,

I am using Mac OS X, and I am sharing a do-file with others who use PCs.

I need to convert the code so that it works on the Mac, and I would like some help.

For the working directory:
global maindir1 "/Users/__/desktop/..."
works fine, but

capture use "/maindir1/....dta",clear
does not work. (capture use "$maindir/....dta", clear does not work either)

Can anyone help me fix the path so that it uses the main directory?

Thank you very much!
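
A minimal sketch of how a global macro is normally referenced in a path. The folder and file names below are hypothetical stand-ins for the elided parts of the post. Two things are worth checking in the lines quoted above: the macro is defined as maindir1 but referenced as maindir, and in the first attempt it is written without the $ prefix, so Stata reads it as a literal folder name.

```stata
* Hypothetical folder and file names, for illustration only
global maindir1 "/Users/yourname/Desktop/project"

* The $ prefix expands the global; the name must match the definition exactly
use "$maindir1/mydata.dta", clear
```

Dropping capture while debugging helps, because capture suppresses the error message that would say what went wrong. Forward slashes also work in Stata on Windows, so a path written this way only needs the root folder changed per machine.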

Selecting Controls based on time from admission to event

Hello,

I have a dataset of unique observations that have been grouped according to whether the participant had an event (Event = Yes) or did not have an event (Event = No). I have too few events in the dataset compared to controls, so I would like to keep all the cases (N = 5 in the dataset below). For each case, I would like to select only the 2 closest controls based on the time from admission to event. For example, the closest matches for case 1, based on time from admission to event, would be controls 16 and 23. I tried to use psmatch for this, but it either drops my cases or requires that I match on other factors. Thank you in advance.


Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte id int(Admitdate eventdate) str3 Event float time
 1 20089 20106 "Yes"  17
 2 20577 20665 "No"   88
 3 20226 20228 "Yes"   2
 4 20227 20229 "Yes"   2
 5 20972 20982 "No"   10
 6 20281 20315 "No"   34
 7 20820 20911 "No"   91
 8 20211 20240 "No"   29
 9 20226 20237 "No"   11
10 20241 20254 "No"   13
11 20270 20272 "Yes"   2
12 20298 20344 "No"   46
13 20089 20097 "No"    8
14 20211 20342 "Yes" 131
15 20226 20230 "No"    4
16 20089 20108 "No"   19
17 20577 20590 "No"   13
18 20226 20233 "No"    7
19 20227 20228 "No"    1
20 20607 20633 "No"   26
21 20089 20152 "No"   63
22 20211 20278 "No"   67
23 20592 20607 "No"   15
24 20241 20350 "No"  109
end
format %tdnn/dd/CCYY Admitdate
format %tdnn/dd/CCYY eventdate

svg layers disappearing

Hi all,

I'm hoping for some advice on troubleshooting. I'm having trouble exporting SVG files: my layers keep disappearing (I think). I tried to create an example but it worked perfectly, so now I'm just confused. The code below creates a histogram in an SVG file that displays fine in Word.

Code:
sysuse lifeexp

twoway (histogram lexp if region == 2, fcolor(ebblue) frequency) ///
(histogram lexp if region!=2, fcolor(purple%50) frequency)

graph export "$graphs/mygraph.svg", replace
However, when I try the code below, using real data (22K obs; 'dob_y' is a double), I get empty layers, as per the image below. (Note this is an annotated screenshot because I don't think I can upload .svg files.) Even when I 'save as' from the graph window, where my graph displays perfectly, I get the same thing. There is a very small amount of data in the bottom left corner. I was wondering if somehow I had set things up to have an enormous graph area, but I'm not sure how to check for that. I'd be grateful for any troubleshooting advice.

Code:
twoway (histogram dob_y, fcolor(ebblue) frequency ) ///
(histogram dob_y if mar==1, fcolor(purple) frequency )
graph export "$graphs/freq_byAge.svg", replace

[Image: annotated screenshot of the exported SVG showing the empty layers]

Storing vec coefficient

Hi,

I have tried to store the coefficient of the vec results.

The code I used is as below.

Code:
use data_file, clear
levelsof id, local(id)
gen coefficient1 = .
gen coefficient2 = .
sort id date
by id: gen time = _n
xtset id time
foreach c of local id {
    vec d.y d.x1 d.x2, lags(5), if id == `c'
    replace coefficient1 = _b[x1] if id == `c'
    replace coefficient2 = _b[x2] if id == `c'
}
The result indicates
[x1] not found

I am not sure what to put inside _b[x1], could someone kindly suggest please?

vce(cluster) with lsdvc

Hello, I am a Stata 13 user. I want to combine vce(cluster) with the following xtlsdvc command, because my dataset has both heteroskedasticity and serial correlation:

xtlsdvc Log_sales varietyofproducts LogGDP searchvolume Log_firmage sqrt_avg_rev_len sqrt_numberofreviews SQRT_comfort_neg_perc SQRT_safety_neg_perc SQRT_exterior_neg_perc yr2009 yr2010 yr2011 yr2012 yr2013 yr2014 yr2015 yr2016, initial(ab) bias(1) vcov(1000)

Kindly share the command for this and guide me on how to combine them. Thank you.

How to hide a specific line property appears in x axis in twoway plot?

Dear Stata users, I want to hide a specific line that appears in the x-axis. Here is my code for the twoway plot:
sysuse auto.dta
g b=_n
twoway line price weight b

Then the graph appears; I have attached it. Now I want to remove only "price" from the x-axis. Please advise.
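
A hedged sketch, assuming that the "price" shown under the x-axis is the legend entry for the first plotted line (the attached graph is not available here, so this is a guess). The legend's order() suboption lists the keys to keep, so naming only the second key drops the "price" entry while leaving the line itself on the graph.

```stata
sysuse auto.dta, clear
g b = _n

* Keep only the legend key for the second series (weight)
twoway line price weight b, legend(order(2))
```

If the goal is instead to remove the price line from the plot altogether, leaving price out of the variable list would do that.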

Binary Response Model

I want to estimate the probability of authoritarian leadership failure as a function of five explanatory variables.
I ran a logit regression with the following functional form:
Pr(Fail) = frac + oil + growth + coup + gdp

• I want to create a figure that shows how the predicted probability of failure changes as I move
the (logged) "oil" variable from -6 to 10 in steps of 0.1.

• How do I calculate the percentage change in the odds of leadership failure when the value of
coup changes from 0 to 1, while holding the other variables constant at their mean values?
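
The two bullet points above could be approached roughly as follows. This is a sketch that assumes the variables are literally named fail, frac, oil, growth, coup and gdp, which the post does not confirm.

```stata
* Variable names are assumed from the formula in the post
logit fail frac oil growth coup gdp

* Predicted probability of failure over a grid of oil values,
* with the other covariates held at their means
margins, at(oil = (-6(0.1)10)) atmeans
marginsplot, recast(line) recastci(rarea)

* Percentage change in the odds when coup goes from 0 to 1
display 100 * (exp(_b[coup]) - 1)
```

In a logit model without interactions, the odds ratio for coup is exp(b) whatever values the other covariates take, so holding them at their means does not change this particular number.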

Analysing multiple response variables

I want to analyse the relationship (using a chi-square test) between (AgeNEW, Gender) and Q42 (production problems people encounter in pigeon pea production). The challenge is that Q42 is a multiple-response question whose options are broken down into variables Q42a-Q42m. Is there a way I can generate a single variable that captures the variables Q42a-Q42m as categories of the new variable?
If yes, how do I go about it?
If no, please guide me on how best to analyse this multiple-response question.

GSEM error

Hello,
I am trying to use gsem with probit and get the following errors:

Code:
     _gsem_d2Sigma_db2():  3301  subscript invalid
_gsem_eval_setup__chol_hinfo():     -  function returned error
_gsem_eval_setup__chol():     -  function returned error
      _gsem_eval_setup():     -  function returned error
 _gsem_eval_chol__work():     -  function returned error
       _gsem_eval_chol():     -  function returned error
      mopt__calluser_v():     -  function returned error
       opt__eval_nr_v2():     -  function returned error
             opt__eval():     -  function returned error
opt__looputil_iter0_common():     -  function returned error
opt__looputil_iter0_nr():     -  function returned error
          opt__loop_nr():     -  function returned error
            _moptimize():     -  function returned error
           Mopt_maxmin():     -  function returned error
                 <istmt>:     -  function returned error

Winsor2 only for one year

Hello everybody,

I have a dataset with many years and want to use winsor2 only for the year 2011 (I do not want to drop the rest since I need them later on).
How is that possible? My code looks like this

winsor2 envcsr, suffix(_w) cuts(1 99)

Thank you very much in advance!
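
A possible sketch, assuming the year variable is called year (the post does not name it) and relying on winsor2 accepting an if qualifier, as its help file indicates. The observations for other years stay in the dataset.

```stata
* Winsorize at the 1st and 99th percentiles using 2011 observations only
* (the variable name year is an assumption)
winsor2 envcsr if year == 2011, suffix(_w) cuts(1 99)

* Inspect the result for 2011
summarize envcsr envcsr_w if year == 2011
```

It is worth checking what envcsr_w holds outside 2011 after running this. If it is missing there and the original values are wanted, a line such as replace envcsr_w = envcsr if year != 2011 would fill them in.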

import many files and generate an `id' variable as the file name.

Dear All, Suppose that I have the following 3 files (in fact, about 4,000 similar files). The file name is 1, 2, 4 with extension xlsx. I wish to import each file, then generate an `id' variable with values all equal to the file name 1, 2, and 4, for all observations in the file, respectively. Finally, I need to -append- them together. Any suggestions?
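
One way this could be done, sketched under some assumptions: the files sit in the current working directory, each has variable names in its first row, and the file names are purely numeric. None of this is stated in the post.

```stata
* Start an empty dataset that accumulates the appended files
clear
tempfile all
save `all', emptyok

* List every .xlsx file in the current working directory
local files : dir "." files "*.xlsx"

foreach f of local files {
    * firstrow assumes the first row holds variable names
    import excel using "`f'", firstrow clear

    * Strip the extension and store the file name as a numeric id
    local stem = subinstr("`f'", ".xlsx", "", .)
    gen id = real("`stem'")

    append using `all'
    save `all', replace
}

use `all', clear
```

If a column is imported as string in one file and numeric in another, append will stop with a type mismatch. Importing everything as strings with the allstring option of import excel is one way around that.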

Mediation analysis with Sem and Ols

Hi all,
I am currently working on mediation analysis and effects decomposition. Since one of my main variables is a latent variable and I have several indirect paths, I opted for a structural equation model. However, as a robustness check, I would also like to compute the mediation with a standard OLS approach and then use the Sobel test to check its significance.

Therefore, the idea is the following:
Code:
reg log_income X P // where X is a matrix of individual characteristics (mostly personality traits) and P is father's social class 

reg log_income X P E // where E is the mediating variable i.e. education level of the individual
The first regression gives the total effect; subtracting the direct effect from the second regression gives the indirect effect. However, it seems that a Sobel test is no longer available in Stata. How can I test the significance of the indirect effect in this case? Or is there another way to implement this "robustness check"?
Thanks for your help.

Quantile regression (QR) for fixed-effects panel data: which command is more appropriate?

Dear all,

I'm working on my PhD thesis (the impact of risk taking on firm performance) using a dataset of 781 observations (T = 11).

I would like to apply QR, but I do not know which command I should use: qreg2, ivqreg2 or xtqreg.

(To my modest knowledge, I think I cannot use xtqreg because T is too short.)

I would be grateful if you could help me with this.

kind regards
sedki

Past event absorb

Hi there,


I need the stock price on the last day of the year before the announcement date.

I have all the stock data and the announcement dates.

For example:

The announcement date is 20-01-2010.
Then I need the stock price on the last day of the prior year.

My dataset contains multiple announcement dates.

Who could help me out?

CEM: Stata vs R

Dear StataListers,

I can't work out why the analyses below, performed in Stata and R, do not produce the same result.

Please provide your suggestions for the explanation for these differences.

Thanks,
Martine

PS (how) can I display the actual cutpoints in Stata (imb)?


-- STATA
. import delimited "G:\Recidive\Projecten\Verkeer\Algemeen\Artikel effectiviteit ASP tijdens\Data\Data voor R.csv", clear
(8 vars, 8,627 obs)

. imb d_sekse lftbegrec d_addinfo_ind_best lftinsz1 vgalgexuz vgverkexuz vgrijoiuz, treatment (treat)
(using the scott break method for L1 distance)

Multivariate L1 distance: .68243298

Univariate imbalance:

                        L1      mean   min   25%   50%   75%   max
d_sekse             .01557   -.01557     0     0     0     0     0
lftbegrec            .0687   -1.5626     0    -1    -2    -1    -5
d_addinfo_ind_best  .00497   -.00497     0     0     0     0     0
lftinsz1            .08852    .91045     0     1     1     2   -12
vgalgexuz           .14912   -1.2611     0     0    -1    -1    78
vgverkexuz          .09852   -.41727     0     0     0    -1     4
vgrijoiuz           .11859   -.36406     0     0     0     0    -4

--R
> data <- read.csv(file = "G:\\Recidive\\Projecten\\Verkeer\\Algemeen\\Artikel effectiviteit ASP tijdens\\Data\\Data voor R.csv")
> cov <- c("D_SEKSE", "LFTBEGREC", "D_ADDINFO_IND_BEST","LFTINSZ1", "VGALGEXUZ", "VGVERKEXUZ", "VGRIJOIUZ")
> imb <- imbalance(group = data$TREAT, data = data[cov])
> imb

Multivariate Imbalance Measure: L1=0.665
Percentage of local common support: LCS=11.2%

Univariate Imbalance Measures:

                      statistic   type          L1  min  25%  50%  75%  max
D_SEKSE            -0.015571508 (diff) 0.015571508    0    0    0    0    0
LFTBEGREC          -1.562561051 (diff) 0.025019213    0   -1   -2   -1   -5
D_ADDINFO_IND_BEST -0.004970458 (diff) 0.004970458    0    0    0    0    0
LFTINSZ1            0.910454843 (diff) 0.045867098    0    1    1    2  -12
VGALGEXUZ          -1.261056065 (diff) 0.146272471    0    0   -1   -1   78
VGVERKEXUZ         -0.417271627 (diff) 0.097957912    0    0    0   -1    4
VGRIJOIUZ          -0.364063976 (diff) 0.118028583    0    0    0    0   -4

Renaming multiple excel files

Hi everyone,

I wish to rename some Excel files from within Stata so that only the first 4 characters of each file name are kept. I ran the following commands:

local files: dir "C:\Users\Dell\OneDrive - The University of Nottingham\DISE" files "*.xlsx"
foreach f of local files {
local fnew = substr("`f'",1,4)
!rename "`f'" "`fnew'.xlsx"
}
macro list `fnew'

But when I view the file names using the last command, I see no change in their names. Could anyone please help with this?
Thanks!
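
A hedged sketch of one thing to try, assuming Windows and the folder quoted in the post. The shell rename runs in Stata's current working directory, so if that is not the folder holding the files, nothing is found to rename. Changing to the folder first, and listing the files with dir afterwards, may help. Note that macro list shows macro contents and not file names, and that the local fnew only exists inside the do-file run that defines it.

```stata
* Move to the folder that holds the files (path taken from the post;
* forward slashes avoid the backslash-before-macro problem in Stata)
cd "C:/Users/Dell/OneDrive - The University of Nottingham/DISE"

local files : dir "." files "*.xlsx"
foreach f of local files {
    local fnew = substr("`f'", 1, 4)
    !rename "`f'" "`fnew'.xlsx"
}

* List the files to confirm the new names
dir *.xlsx
```

If two files share the same first 4 characters, the second rename will fail because the target name already exists.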

Generate Variable - StataIC - Panel Data

Hello,

I am using StataIC and the panel data file mathpnl, with 2750 observations and 57 variables.
The data file contains panel data on districts' expenditures per student from 1992 to 1998.

In order to run an IV regression, where I want to control for initial spending in 1992, I need to generate a new variable.

I have tried the code:

// gen exp1992= rexpp if year == 1992 //

and this gives me a new variable with the value for expenditures (only) in the year 1992.
But what I actually want is for my dataset to look like the example in the table below (see the exp1992 column), where the value of expenditures in 1992 is given for every observation.
rexpp year exp1992
125.312 1992 125.312
176.418 1993 125.313
I have tried several approaches, but none of them worked.
Is there a way to adjust my code to get this result?
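
A minimal sketch of one common approach, assuming the district identifier is called distid (the post does not name the panel variable). The cond() expression returns rexpp only in 1992 and missing otherwise, and egen's max() then spreads that single value to every year of the same district.

```stata
* distid is an assumed name for the district identifier
bysort distid: egen exp1992 = max(cond(year == 1992, rexpp, .))

* Check a few districts
list distid year rexpp exp1992 in 1/14, sepby(distid)
```

Districts with no 1992 observation would end up with exp1992 missing in all years.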