Channel: Statalist

Derive S&P1500 sample

Dear Forum,

I have a question on how to create a sample of S&P 1500 firms.


What I need:
I need to collect a sample of all firms in the S&P1500 from 1990 to 2010. I use WRDS and Stata and I need information on accounting and stock data.

More specifically:

If a firm was in the S&P1500 for at least one year (= t0) I would need data on this firm in the range from t-3 to t4.

Example 1: Firm A appears in the S&P1500 in the year 2000 (any month). Then, I would need data on this firm from 1997 to 2004.
Example 2: Firm B appears in the S&P1500 in the years 1995, 1996, and 1997. Then, I would need data on this firm from 1992 to 2001.


What I have done so far:

As I need accounting and stock data I turned to the CRSP-Compustat-Merged database, searched for all firms in the database and merged it with CRSP.

In WRDS, I used the "Compustat Monthly Updates - Index Constituents" query and searched for all firms from January 1990 to December 2010, using the "i0020" ticker (which is for the S&P1500).

As a result, I get a list of all companies with a "from" and "thru" date constituting the entry and exit dates in/from the S&P1500. The variables "from" and "thru" are in the format "14feb2010".


My idea on how to solve this

- transform the two variables "from" and "thru" to a "long" format on a yearly basis

Current format:
gvkey From Thru Index
001 10sep1999 03aug2001 S&P1500
002 04jan1996 24mar1997 S&P 1500

Target format:
gvkey year index (changed to dummy var)
001 1999 1
001 2000 1
001 2001 1
002 1996 1
002 1997 1

- use (or create) a unique identifier for the S&P1500 firms
- merge the index data with my CRSP/Compustat merged database on gvkey and year (fyear = year)
- question: the CRSP-Compustat-merged and Compustat data which I merged before are merged on permno. Is it a problem if I use gvkey for the S&P1500 merge?
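A sketch of the expansion step (assuming From and Thru are Stata daily dates; if they are strings such as "14feb2010", convert them first, e.g. gen fromd = date(From, "DMY")):

Code:
* sketch: expand each from/thru spell to one row per gvkey-year
gen int first = year(From)
gen int last  = year(Thru)
expand last - first + 1
bysort gvkey From: gen int year = first + _n - 1
gen byte index = 1
keep gvkey year index
duplicates drop

Overlapping spells for the same gvkey collapse to one row per year via duplicates drop.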


My questions

—> does this approach make sense?

—> if so, how could I do the transformation part? I have tried several ways but was not able to do it.


Any guidance on how to do this is highly appreciated!

Thanks a lot,

Samuel

Boxplots do not appear as expected

Dear All,

I am using Stata IC 14.2 (64-bit).

I am developing boxplots (sorted by a nominal variable: n200_cab_blind). Some of the boxes appear different (lighter in color) than the rest. There is no part of my code that dictates that difference in color. The boxes that are affected are always on the right-hand side of my graphs.

I have included the graph, as well as the code that I use.

Code:
egen median= median(bladderlowcoretemp), by(n200_cab)

sum bladderlowcoretemp, det

di r(p50)

local x=r(p50)

local x: di %9.1f `x'

graph box bladderlowcoretemp if bladderlowcoretemp~=., yline(`r(p50)') ylabel(, angle(0)) over(n200_cab_blind, label(angle(rvertical) labsize(small)) sort(median)) nooutsides ytitle("Temperature, degrees") title ("Lowest Core Temp (C) by Center") note("Isolated CAB" "Only Includes Cases Matched to Surgical Records" "Site for Core Temp was Bladder" "Red line:median value across the database is:`x' degrees")

graph export "Blinded\Temperature Management\lowesttemp_isolcabg.wmf", replace
I appreciate help from the group.

Reshape

Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str11 id str54 entity str6 int_period
"101825" "BRANCH"     "before"
"101805" "BRANCH"     "after" 
"101744" "BRANCH"     "before"
"101825" "BRANCH"     "before"
"101825" "BRANCH"     "before"
"101744" "BRANCH"     "before"
"101825" "BRANCH"     "before"
"101809" "EquityBulk" "before"
"100879" "EquityBulk" "after" 
"100088" "EquityBulk" "before"
"101734" "EquityBulk" "before"
"101781" "EquityBulk" "before"
"101572" "EquityBulk" "before"
"101553" "EquityBulk" "before"
end
In the above data, I would like to reshape wide such that each category of entity becomes a new variable, with the corresponding int_period category as its value for each id.
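A hedged sketch, assuming the goal is one row per id with one variable per entity holding the int_period category (note that some id-entity pairs repeat, so exact duplicates are dropped first; this assumes each id-entity pair carries a single int_period value):

Code:
* sketch: one row per id, one int_period variable per entity
duplicates drop id entity int_period, force
reshape wide int_period, i(id) j(entity) string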

How to convert all variables with different currencies to one-currency-based variables in a big dataset?

Hello everyone!

I am having trouble cleaning a big dataset and really need your help. Would you possibly have a look for me?

Since the variables are recorded in different reporting currencies, what I want to do is convert them all to a single currency, EUR. For example, two variables are recorded in three different currencies: GBP, EUR, and USD. Since the dataset is quite big, it's impossible for me to convert it by hand. Using exchange rates, how can I convert all variables recorded in GBP and USD to EUR-based variables in Stata?

The dataset is similar to the following:

id NAV GAV Reporting currency
1 10m 20M GBP
2 15m 23M GBP
3 20m 24M GBP
4 26m 30M GBP
5 28m 32M USD
6 24m 23M USD
7 22m 33M USD
8 24m 30M EUR
9 40m 31M EUR
10 34m 32M EUR
... .... ....
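One common approach is to merge in a small exchange-rate lookup file and multiply. A sketch, assuming NAV and GAV are numeric, the currency variable is named currency, and a hypothetical rates.dta holds one row per currency with a variable eur_per_unit (EUR = 1):

Code:
* sketch: convert all monetary variables to EUR via a rates lookup
merge m:1 currency using rates, keep(master match) nogenerate
foreach v of varlist nav gav {
    replace `v' = `v' * eur_per_unit
}

If NAV/GAV are stored as strings like "10m", destring them first.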


Many thanks for your time, I will be most grateful if you can help me out!

Best,

Jae

how to retrieve data from a string?


Hi everybody!
I have a question about Stata. I have a file with many rows and just one column. The column is text, with entries such as:


TIR SOCIAL: 7.31 - TIR SOCIAL: 6.53 - VAN SOCIAL: 322218 - VAN SOCIAL: 449425

or

TIR SOCIAL: 0 - TIR SOCIAL: 3.23 - TIR SOCIAL: 4.11

My point is that I need Stata to locate every occurrence of TIR SOCIAL and, for the first row, generate column 2 with the number 7.31 and column 3 with the number 6.53. For the second row I need the value 0 in the second column, the value 3.23 in the third column, and the value 4.11 in the fourth column.

Is it possible to do such a thing with Stata?
(I am using Stata 11.2)
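The regex functions available in Stata 11 can do this. A sketch, assuming the text is in a variable called text and at most four TIR SOCIAL values occur per row:

Code:
* sketch: peel off each "TIR SOCIAL: <number>" in turn
gen work = text
forvalues i = 1/4 {
    gen double tir`i' = real(regexs(1)) if regexm(work, "TIR SOCIAL: *([0-9.]+)")
    replace work = regexr(work, "TIR SOCIAL: *([0-9.]+)", "")
}
drop work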

Best regards,
Alex

Stata Procedure Class Issue with HCUP provided do file

I am using an HCUP provided do file listed below:

* Generate a unique identifier
*
egen _obs = seq()
*
* Reshape the data into long format with one observation per procedure
*
reshape long pr, i(_obs) j(prnum)
*
* Generate a temporary procedure variable that will be reformatted by the clean function in preparation for the merge
*
generate _pr = pr
*
* Check the validity of the procedure
*
capture: icd9p check _pr, generate(invalid)
*
* replace invalid temporary diagnoses in preparation for the clean function
*
replace _pr="0000" if invalid > 0 & invalid < 10
drop invalid
*
* Format the temporary procedure with a decimal to match the format in PRclass.dta. Sort by formatted procedure.
*
icd9p clean _pr, dots
sort _pr
*
* Merge the Procedure Class variable, PCLASS, that matches the temporary procedure
*
merge _pr using PRclass, nokeep
*
* Drop temporary variables and put data in original shape
*
drop _merge _pr
reshape wide pr PCLASS, i(_obs) j(prnum)
drop _obs

This is the data structure the do file is intended to run on

_pr I9Description PCLASS _obs
0001 THERAP ULTRASOUND OF HEAD AN (Begin 2002) 2 1
0002 THERAPEUTIC ULTRASOUND OF HE (Begin 2002) 2 2

This is the dataex output

----------------------- copy starting from the next line -----------------------
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str6 _pr str60 I9Description str1 PCLASS int _obs
"0001" "THERAP ULTRASOUND OF HEAD AN (Begin 2002)"                    "2"   1
"0002" "THERAPEUTIC ULTRASOUND OF HE (Begin 2002)"                    "2"   2
"0003" "THERAP ULTRASOUND PERIPHRL V (Begin 2002)"                    "2"   3
"0009" "OTHER THERAPEUTIC ULTRASOUND (Begin 2002)"                    "2"   4
"0010" "IMPLANTATION OF CHEMOTHERAPE (Begin 2002)"                    "2"   5
"0011" "INFUSION DROTRECOGIN ALFA (A (Begin 2002)"                    "2"   6
"0012" "ADMINISTRATION OF INHALED NI (Begin 2002)"                    "2"   7
"0013" "INJECTION OR INFUSION OF NES (Begin 2002)"                    "2"   8
"0014" "INJECT/INFUS OF OXAZOLIDINON (Begin 2002)"                    "2"   9
"0015" "HIGH-DOSE INFUSION IL-2 (Begin 2003)"                         "2"  10
"0016" "PRESSURIZED TREAT GRAFT (Begin 2004)"                         "2"  11
"0017" "INFUSION OF VASOPRESSOR (Begin 2004)"                         "2"  12
"0018" "INFUS IMMUNOSUP ANTIBODY (Begin 2005)"                        "2"  13
"0019" "BBBD VIA INFUSION (Begin 2007)"                               "2"  14
"0021" "IVUS EXTRACRAN CEREB VES (Begin 2004)"                        "1"  15
"0022" "IVUS INTRATHORACIC VES (Begin 2004)"                          "1"  16
"0023" "IVUS PERIPHERAL VESSELS (Begin 2004)"                         "1"  17
"0024" "IVUS CORONARY VESSELS (Begin 2004)"                           "1"  18
"0025" "IVUS RENAL VESSELS (Begin 2004)"                              "1"  19
"0028" "INTRAVASCUL IMAGING NEC (Begin 2004)"                         "1"  20
"0029" "INTRAVASCUL IMAGING NOS (Begin 2004)"                         "1"  21
"0031" "CAS W CT/CTA (Begin 2004)"                                    "1"  22
"0032" "CAS W MR/MRA (Begin 2004)"                                    "1"  23
"0033" "CAS W FLUOROSCOPY (Begin 2004)"                               "1"  24
"0034" "IMAGELESS COMP ASST SURG (Begin 2004)"                        "1"  25
"0035" "CAS W MULTIPLE DATASETS (Begin 2004)"                         "1"  26
"0039" "OTHER CAS (Begin 2004)"                                       "1"  27
"0040" "PROCEDUREONE VESSEL (Begin 2005)"                             "2"  28
"0041" "PROCEDURETWO VESSELS (Begin 2005)"                            "2"  29
"0042" "PROCEDURETHREE VESSELS (Begin 2005)"                          "2"  30
"0043" "PROCEDUREFOUR+ VESSELS (Begin 2005)"                          "2"  31
"0044" "PROC-VESSEL BIFURCATION  (Begin 2006)"                        "2"  32
"0045" "INSERT 1 VASCULAR STENT (Begin 2005)"                         "2"  33
"0046" "INSERT 2 VASCULAR STENTS (Begin 2005)"                        "2"  34
"0047" "INSERT 3 VASCULAR STENTS (Begin 2005)"                        "2"  35
"0048" "INSERT 4+ VASCULR STENTS (Begin 2005)"                        "2"  36
"0049" "SUPERSAT O2 THERAPY (begin 2008)"                             "2"  37
"0050" "IMPLA RESYNCHR PACEMAKER W/0 (Begin 2002)"                    "4"  38
"0051" "IMPLA RESYNCHRONIZATION DEFI (Begin 2002)"                    "4"  39
"0052" "IMPL/REPL TRANSVENOUS LEAD L (Begin 2002)"                    "4"  40
"0053" "IMPL/REPL PACEMAKER PLSE GE (Begin 2002)"                     "4"  41
"0054" "IMPL/REPL DEFIBRIL GENERATOR (Begin 2002)"                    "4"  42
"0055" "INSERT DRUGELUTING NONCRNRY (Begin 2002)"                     "2"  43
"0056" "INS/REP IMPL SENSOR LEAD (Begin 2006)"                        "4"  44
"0057" "IMP/REP SUBCUE CARD DEV  (Begin 2006)"                        "4"  45
"0058" "INS INTRA-ANSM PRES MNTR (begin 2008)"                        "1"  46
"0059" "INTRAVASC MSMNT COR ART (begin 2008)"                         "1"  47
"0060" "INS D-E STNT SUP FEM ART (Begin 2010)"                        "2"  48
"0061" "PERC ANGIO PRECEREB VESS (Begin 2004)"                        "4"  49
"0062" "PERC ANGIO INTRACRAN VES (Begin 2004)"                        "4"  50
"0063" "PERC INS CAROTID STENT (Begin 2004)"                          "2"  51
"0064" "PERC INS PRECEREB STENT (Begin 2004)"                         "2"  52
"0065" "PERC INS INTRACRAN STENT (Begin 2004)"                        "2"  53
"0066" "PTCA OR CORONARY ATHER (Begin 2005)"                          "4"  54
"0067" "INTRAVAS MSMNT THORC ART (begin 2008)"                        "1"  55
"0068" "INTRAVAS MSMT PERIPH ART (begin 2008)"                        "1"  56
"0069" "INTRAVS MSMT VES NEC/NOS (begin 2008)"                        "1"  57
"0070" "REV HIP REPLACETAB/FEM (Begin 2005)"                          "4"  58
"0071" "REV HIP REPLACETAB COMP (Begin 2005)"                         "4"  59
"0072" "REV HIP REPLFEM COMP (Begin 2005)"                            "4"  60
"0073" "REV HIP REPLLINER/HEAD (Begin 2005)"                          "4"  61
"0074" "HIP REPL SURFMETAL/POLY (Begin 2005)"                         "2"  62
"0075" "HIP REP SURFMETAL/METAL (Begin 2005)"                         "2"  63
"0076" "HIP REP SURFCERMC/CERMC (Begin 2005)"                         "2"  64
"0077" "HIP REPL SURF-CERMC/POLY (Begin 2006)"                        "2"  65
"0080" "REV KNEE REPLACEMTTOTAL (Begin 2005)"                         "4"  66
"0081" "REV KNEE REPLTIBIA COMP (Begin 2005)"                         "4"  67
"0082" "REV KNEE REPLFEMUR COMP (Begin 2005)"                         "4"  68
"0083" "REV KNEE REPLACEPATELLA (Begin 2005)"                         "4"  69
"0084" "REV KNEE REPLTIBIA LIN (Begin 2005)"                          "4"  70
"0085" "RESRF HIPTOTAL-ACET/FEM (Begin 2006)"                         "4"  71
"0086" "RESRF HIPPART-FEM HEAD  (Begin 2006)"                         "4"  72
"0087" "RESRF HIPPART-ACETABLUM (Begin 2006)"                         "4"  73
"0091" "TRNSPLNT LIVE REL DONOR (Begin 2004)"                         "2"  74
"0092" "TRNSPLNT LIVE NONREL DONOR (Begin 2004)"                      "2"  75
"0093" "TRANSPLANT CADAVER DONOR (Begin 2004)"                        "2"  76
"0094" "INTRA-OP NEUROPHYS MONTR (Begin 2007)"                        "1"  77
"0095" "INJECTION OR INFUSION OF GLUCARPIDASE (Begin 2012)"           "2"  78
"0096" "INFUSION OF 4-FACTOR PROTHROMBIN COMPLEX CONCENTRATE��������" "2"  79
"0101" "CISTERNAL PUNCTURE"                                           "2"  80
"0102" "VENTRICL SHUNT TUBE PUNC"                                     "2"  81
"0109" "CRANIAL PUNCTURE NEC"                                         "2"  82
"0110" "INTRACRAN PRESSURE MONTR (Begin 2007)"                        "1"  83
"0111" "CLOS CEREB MENINGES BX"                                       "1"  84
"0112" "OPEN CEREB MENINGES BX"                                       "3"  85
"0113" "CLOSED BRAIN BIOPSY"                                          "1"  86
"0114" "OPEN BRAIN BIOPSY"                                            "3"  87
"0115" "SKULL BIOPSY"                                                 "3"  88
"0116" "INTRACRANIAL 02 MONITOR (Begin 2007)"                         "1"  89
"0117" "BRAIN TEMP MONITORING (Begin 2007)"                           "1"  90
"0118" "OTHER BRAIN DX PROCEDURE"                                     "3"  91
"0119" "OTHER SKULL DX PROCEDURE"                                     "3"  92
"0120" "IMP/REPL BRAIN PULSE GEN (Begin 2010)"                        "4"  93
"0121" "CRANIAL SINUS I & D"                                          "4"  94
"0122" "REMOV INTRACRAN STIMULAT"                                     "4"  95
"0123" "REOPEN CRANIOTOMY SITE"                                       "4"  96
"0124" "OTHER CRANIOTOMY"                                             "4"  97
"0125" "OTHER CRANIECTOMY"                                            "4"  98
"0126" "INS CATHCRANIAL CAVITY (Begin 2005)"                          "2"  99
"0127" "REM CATHCRANIAL CAVITY (Begin 2005)"                          "2" 100
end
------------------ copy up to and including the previous line ------------------

Listed 100 out of 3948 observations
Use the count() option to list more


This is the error in the output.

. reshape long pr, i(_obs) j(prnum)
no xij variables found
You typed something like reshape wide a b, i(i) j(j).
reshape looked for existing variables named a# and b# but could not find any. Remember this picture:

        long                                 wide
        +---------------+                    +------------------+
        | i   j   a   b |                    | i   a1 a2  b1 b2 |
        |---------------|  <--- reshape ---> |------------------|
        | 1   1   1   2 |                    | 1   1  3   2  4  |
        | 1   2   3   4 |                    | 2   5  7   6  8  |
        | 2   1   5   6 |                    +------------------+
        | 2   2   7   8 |
        +---------------+

        long to wide: reshape wide a b, i(i) j(j)    (j existing variable)
        wide to long: reshape long a b, i(i) j(j)    (j new variable)



.
Any ideas on why the reshape command is not working?
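For what it may be worth: the dataex output shows the data already in long form, with one row per procedure in _pr, so reshape long pr finds no pr1, pr2, ... variables to stack. A hedged sketch that skips the reshape steps entirely:

Code:
* sketch: data already long, so clean and merge directly on _pr
icd9p clean _pr, dots
sort _pr
merge _pr using PRclass, nokeep
drop _merge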

Thanks

Fixed/random effects for a non-randomly unbalanced panel

Hi there,

I have an unbalanced panel dataset consisting of a large number of loans with their repayments observed over time.

The panel is definitely non-randomly unbalanced as loans drop out of the dataset after the time-period of full-repayment.

Can anyone suggest a way to correct the fixed and random effects models for this?

Many thanks.
David

Create dummy set for logistic regression

Hi!

I am using time series data with peacekeeping mission-years observations. The dependent variable is a dummy variable of allegations of sexual exploitation and abuse by peacekeepers. I have various independent variables in the dataset, including gdp, the mission size, women mentioned in the mandate, the gender composition of the mission etc.

Now I have created dummy variables for every country that has ever contributed troops to a certain mission: (1) if it has and (0) if not. For example albania_TCC is coded for every mission-year observation either 1 or 0, based on the fact that albania has contributed troops or not. I have 117 of these dummy country variables. I would like to find out whether the contribution of troops by a certain country increases or is associated with the likelihood of allegations of sexual exploitation and abuse by peacekeepers. I want to find out if the presence and contribution of a country is somehow correlated with allegations of sexual exploitation and abuse.

My supervisor suggested running the regression with the dummies as a set, with one country as the reference category. I have done some research and found the tabulate command, but it didn't really work.
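One way to pass the dummies as a set with a reference category, sketched with hypothetical variable names (the outcome is called sea_allegation here):

Code:
* sketch: collect the 117 dummies and omit one as the reference
ds *_TCC
local tcc `r(varlist)'
local ref albania_TCC
local tcc : list tcc - ref
logit sea_allegation gdp mission_size `tcc'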

Could someone help me?

Thanks a lot in advance!

Céline

Failure to include double in local macro, even when using compound double quotes

Dear all,

I have some country-level panel data and have been constructing a twoway plot in which I wanted to include a separate line for each country, overlaid. Since the set of countries that enters the plot changes, I decided to construct the plot command in a loop which creates a string containing the plot command. Here it is:

levelsof partner_code, local(partners)
local plot ""
local counter = 1
foreach p in `partners'{

*disp "`p'"

if `counter' == 1{
local plot = "`plot'" + "(connect rate_ad_val year if partner_code == " + "`p'" + ")"
local counter = 0
}

else{
local plot = "`plot'" + "||(connect rate_ad_val year if partner_code == " + "`p'" + ")"

}

*disp "`plot'"

}

Since "partner_code" is a string variable, I need the local `p' to be surrounded by quotes, as in the command. I tried to add compound double quotes (by modifying "`p'" into `"`p'"', or by adding a separate + `" " "' + around the local) but with no success. Stata either sees the term " ` " as a string or returns an error message. Does anyone know why it might not work? Maybe I am specifying something incorrectly?
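One idiom that may help: wrap the whole assignment in compound double quotes so the embedded quotes survive expansion. A sketch:

Code:
* sketch: nested compound quotes preserve the quotes around `p'
levelsof partner_code, local(partners)
local plot
foreach p of local partners {
    local plot `"`plot' (connect rate_ad_val year if partner_code == `"`p'"')"'
}
twoway `plot'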

Olga

Multilevel Longitudinal Analysis

Hi,

I have daily performance data of different departments in different organizations. I want to see how new hires in the department influence the performance of that department. Here is a small sample of my data that shows the daily department performance for 3 departments in 3 different organizations. I know the date a new person joined the department. The new_hire variable would be 1 if on that date a new employee joined otherwise it would be 0.

I want to know whether a new team member can influence the performance of the department, but I am not sure how to evaluate this question in this multilevel longitudinal data. I really appreciate your help. ndate is the numeric equivalent of the date variable.



Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int organization long department float(ndate performance new_hire)
1004 1 20545         4 1
1004 1 20546         5 0
1004 1 20552         4 0
1004 1 20553       4.5 0
1004 1 20557         4 0
1004 1 20560         5 0
1004 2 20545       4.5 0
1004 2 20546 4.6666665 0
1004 2 20551         4 0
1004 2 20552         5 1
1004 2 20555         4 0
1004 2 20556         4 0
1004 2 20557         4 0
1004 2 20558         4 1
1004 2 20559         4 0
1004 2 20560       4.5 0
1004 2 20561       4.5 0
1004 3 20545         5 1
1004 3 20547         5 0
1004 3 20549         5 0
1004 3 20551       4.5 0
1004 3 20552         5 0
1004 3 20557      4.75 1
1004 3 20558 4.6666665 0
1004 3 20560         5 0
1005 1 20546         2 0
1005 1 20548         2 0
1005 1 20554         2 1
1005 1 20557         3 0
1005 1 20558         2 0
1005 1 20560         2 0
1005 1 20561         2 0
1005 2 20546         5 1
1005 2 20547         5 0
1005 2 20548         5 0
1005 2 20549         4 0
1005 2 20552         5 0
1005 2 20553         5 0
1005 2 20554         5 0
1005 2 20555         5 0
1005 2 20556         5 1
1005 2 20557         5 0
1005 2 20558 1.6666666 0
1005 2 20560         5 0
1005 3 20545         5 0
1005 3 20546       2.5 0
1005 3 20548         3 0
1005 3 20549         1 0
1005 3 20550         1 0
1005 3 20551 1.3333334 0
1005 3 20553         5 1
1005 3 20554         1 0
1005 3 20555       2.5 0
1005 3 20556         1 0
1005 3 20558         1 0
1005 3 20560         3 0
1005 3 20561         3 1
1006 1 20545         4 0
1006 1 20546         4 0
1006 1 20547         5 0
1006 1 20549         4 0
1006 1 20552         5 0
1006 1 20554         3 0
1006 1 20557         5 0
1006 1 20559         5 0
1006 1 20560         4 1
1006 2 20545         4 0
1006 2 20546         4 0
1006 2 20549         4 0
1006 2 20550         4 0
1006 2 20553         4 0
1006 2 20557         4 0
1006 2 20559         4 0
1006 2 20560         4 0
1006 2 20561         4 1
1006 3 20545         4 0
1006 3 20546         4 0
1006 3 20547         4 0
1006 3 20551         4 0
1006 3 20552         4 0
1006 3 20554         4 0
1006 3 20558         4 0
1006 3 20560         4 0
1006 3 20561         4 1
end
format %td ndate
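One possible starting point (a sketch, not a definitive specification) is a three-level mixed model with departments nested in organizations; since new_hire flags only the arrival date, one might instead construct a post-hire indicator if the interest is in performance after the arrival:

Code:
* sketch: three-level random-intercept model
mixed performance i.new_hire ndate || organization: || department: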

Collapse multiple observations to unique observation with subtotaling other variables

Hello, my data looks like this
School District Revenue
School1 10
School1 15
School1 30
School2 10
School2 20
School2 40
and I want to collapse School District to a unique observation with all the revenues combined (i.e., 10+15+30 for School1). So I want my data to look like this:
School District Revenue
School1 55
School2 70
Thank you and please help me with the code.
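Assuming the variables are named school_district and revenue, a sketch:

Code:
* sketch: sum revenue within each school district
collapse (sum) revenue, by(school_district)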

import excel file with space in name of file or folder

I tried forward slashes and single and double quotes in all possible combinations, but I still can't get Stata (v 14) to import an Excel file that resides in a folder with a space in the name, or a file that itself has a space in the name. Any suggestions?
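For reference, quoting the entire path in double quotes normally works; a sketch with a hypothetical path:

Code:
import excel "C:/My Folder/Some File.xlsx", firstrow clear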

Thanks,

Al Feiveson

unusual characters

Dear Readers

I have some unusual characters in a data file that I want to get rid of (Stata 14.2). I have tried various things, including:

charlist varname
. ret list

macros:
r(chars) : "'0123456789����"
r(sepchars) : "' 0 1 2 3 4 5 6 7 8 9 � � � � "
r(ascii) : "39 48 49 50 51 52 53 54 55 56 57 128 153 157 226 "

* ascii codes 128-226 are the problems.

. foreach c in 128 153 157 226 {
2. di uchar(`c')
3. }
€
™

â

*these are not the offending characters (º, o, º, ’, ”, ', ’, and similar). I have tried

local char1 = char(157)
replace varname = subinstr(varname, "`char1'", "", .)

* but this does not work (it replaces lots of things but not any of the offending characters)

* I cleared everything from memory and tried

. clear

. drop _all

. des

Contains data
obs: 0
vars: 0
size: 0
Sorted by:

. unicode analyse "D:\india\banerjee&iyer\gis\DAMS\temp.dta"
analyse is an invalid unicode subcommand
unicode syntax is
unicode analyze filespec
unicode encoding set encoding
unicode translate filespec [, ...]
unicode retranslate filespec [, ...]
unicode restore filespec [, ...]

analyze and [re]translate can handle Stata datasets as well as text files such as do-files, ado-files, help files, etc.

There must be no data in memory. See help unicode.
r(197);

.
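Since Stata 14 stores strings as UTF-8, char(157) no longer corresponds to the byte you see on screen, which may explain why subinstr() misses the offending characters. A sketch that strips every character outside printable ASCII (hedged; adjust the range if you need to keep some non-ASCII characters):

Code:
* sketch: remove all non-printable-ASCII characters (Stata 14+)
replace varname = ustrregexra(varname, "[^\u0020-\u007e]", "")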

Help please

Best

Richard

predict value for f(x)

Hello, I have a non-linear function, f(x), that I estimated with plreg. I'd like to know the value of the function for a specific value not included in the sample, x = 5. I tried mipolate, but still don't know how to estimate the value. Any suggestions?
Thank you

Beta regression: parameter interpretation

Hi all.
I am writing because I have a problem with interpreting a beta regression.

I have read several papers about regression with a dependent variable bounded between 0 and 1.
I fit three different models to my data:
  1. Linear model (FGLS)
  2. Logit model (in this case, I transform my dependent variable into a dummy variable)
  3. Beta regression model.
All of the models fit the data well, but I think that the best one is the beta regression model.
My problem is that I don't understand how to interpret the coefficients in the output of the betareg Stata command, or how to use the postestimation commands.
I'm a new Stata user and only recently began studying non-linear models.
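For interpretation, one hedged sketch (with hypothetical variable names): betareg coefficients are on the logit scale of the conditional mean, so average marginal effects from margins put them back on the proportion scale:

Code:
* sketch: marginal effects on the conditional mean of y
betareg y x1 x2
margins, dydx(x1 x2)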

Batch mode (Windows) log name inconsistencies

I'm using python's multiprocessing and subprocess modules to spawn multiple stata processes in parallel to speed up running a general sequence of scripts on several different datasets. This has worked quite well, but I've noticed some odd behaviour in the log names that I was wondering about.

According to Method 2 in http://www.stata.com/support/faqs/windows/batch-mode/ , if I cd to a location D:/SomeDirectory/SubDirectory and run

Code:
StataMP-64 /e do script.do arg1 arg2 arg3
(I have Stata in my path) it should run script.do with the arguments arg1 arg2 arg3 and route the output to script.log

I've noticed, however, that if one of the arguments is a location and has slashes (either / or \), e.g. I run
Code:
StataMP-64 /e do script.do data1 //fake/path/here fmm
instead of routing the output to 'script.log', it routes it to 'here fmm.log' (i.e. <first word after the last slash><space><subsequent arguments separated by spaces>.log). The program runs fine and the arguments are parsed correctly internally (i.e. `1' contains data1, `2' contains //fake/path/here and so on), so I've been getting around the problem by creating folders for each combination of arguments and explicitly writing to a log inside each folder, but I was wondering if anyone had encountered this problem before and found a possible solution to the weird log names. I'm not sure whether this is a Stata-specific issue or an issue with the way CMD/PowerShell parses command-line arguments.

SEM: Can I limit the covariances that are shown?

When I use the SEM Builder estimation tool, it reports the covariance of every single variable in my model. Is there any way I can get Stata to not show the covariances of certain variables?

Using tabout function and want to concatenate the frequency column and column percentages

Dear Stata Users,

I am using the tabout command. My goal is to produce one column rather than two. More specifically, I would like to have the frequency and then the column percentage in parentheses.

To provide an example:

N Col% -> N (Col.%)
5 12.5% 5 (12.5%)

Thanks in advance for any help or suggestions,
Elizabeth

compare means of clustered data

I have completed a mouse study and have compared mean pup birth weight among pups born to moms that were in one of four treatment groups using pairwise t-tests as follows:

ttest pup_avg if group==1 | group==3, by(group)

I created a variable that averaged the pups born to a treated mom (pup_avg).

However, I need to account for clustering of pups born to the same mom in the same pregnancy.

Each mom has her own ID (variable = mouse_id) and within each mom's observation are variables for each pup weight (weight1, weight2, weight3, etc).

Any advice on how I can compare mean pup birth weight between two different treatment groups and account for clustering within the same mouse_id?

Thanks so much.
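One hedged sketch: reshape the pup weights to long form and compare groups with cluster-robust standard errors at the mom level:

Code:
* sketch: one row per pup, clustering on mouse_id
reshape long weight, i(mouse_id) j(pup)
drop if missing(weight)
regress weight i.group if inlist(group, 1, 3), vce(cluster mouse_id)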

Interpreting a Local Counter with multiple loops

Hi,

I am trying to understand another person's do-files in order to then replicate some regressions, and I am a little confused about how to interpret the results of certain lines of code provided below.


Code:
* Globals:
//    gl type_data "Sub All"
    gl type_data "All"
* Locals:
    local files General_Demographics Health Income_and_Consumption Employment Employment_1   Confidence  Social_Norms  Literacy_and_Education  Political_Participation Enrollment

local counter = 1    
    foreach filename of local files {
        foreach spec in "S5_W" "S5_NoW" "S6i" "S6ii" "S6iii"{
        
            foreach z of local var`counter' {
                        
            * Specify which dataset we will use
                if "`type'"=="All"{
                    use "$output/Selection_Analysis_All.dta", clear
                }
                if "`type'"=="Sub" & ("`spec'"=="S5_W" | "`spec'"=="S5_NoW") {
                    use "$output/Selection_Analysis_`type'_S5.dta", clear
                }
                if "`type'"=="Sub" & ("`spec'"=="S6i" | "`spec'"=="S6_Ci") {
                    use "$output/Selection_Analysis_`type'_S6i.dta", clear
                }
                if "`type'"=="Sub" & ("`spec'"=="S6ii" | "`spec'"=="S6_Cii") {
                    use "$output/Selection_Analysis_`type'_S6ii.dta", clear
                }
                if "`type'"=="Sub" & ("`spec'"=="S6iii" | "`spec'"=="S6_Ciii") {
                    use "$output/Selection_Analysis_`type'_S6iii.dta", clear
                }* REGRESSIONS
                    if "`spec'"=="S5_W"  {
                        if `counter'==10 {
                            eststo `z'_panel1: areg `z' NVH initial_offer [pw=wt], a(cluster_code_new) cl(cluster_code_new)
                        }
                        else if `counter'!=10{
                            eststo `z'_panel1: areg `z' NVH [pw=wt], a(cluster_code_new) cl(cluster_code_new)
                        }


Description:

-There are 10 local files( listed in the code) which refer to various indices of outcomes.
-There are two types of subsamples that the global refer to: All data, subset of data
-There are 5 specifications, and each one is run on each of the 10 indices.

I am confused about how to read the local counter and what the value of `counter'==10 actually indicates.

Does one iteration mean: Data type1 (Sub), Local file1 (General Demographics), and Spec 1 (S5_W) and thus the 10th iteration (if `counter'==10) would be data type 1(sub), local file 2 (health), and specification 5 (S6iii)?
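As a minimal sketch of the mechanism (note that in the excerpt shown, `counter' is set to 1 before the loops and never incremented, so where it advances depends on code outside the excerpt):

Code:
* sketch: a counter local advancing once per outer iteration
local counter = 1
foreach f in A B C {
    foreach s in x y {
        display "file `f', spec `s', counter = `counter'"
    }
    local ++counter
}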


Thanks in advance, and please let me know if I can provide more information/code.

