using TSSPELL to count job tenure in weeks

June 12, 2016, 8:57 am


Hi everyone--

I have a large data set consisting of labor force outcomes for 9000 individuals reported on a weekly basis for roughly 15 years. I am looking to create a variable that counts the number of weeks an individual works with the same employer (job tenure), not including unemployment/dropping out of the labor force weeks in between working for the same employer. The current variable I have (EMP_STATUS_) reports the unique employer ID (denoted in this example as 199701) as well as any unemp/out of LF spells (denoted as -4), so it looks kinda like this:

WEEK EMP_STATUS_ JOBTENURE
1 199701 1
2 199701 2
3 199701 3
4 199701 4
5 199701 5
6 199701 6
7 -4 .
8 -4 .
9 -4 .
10 -4 .
11 199701 7
12 199701 8
13 199701 9

In this case, since the individual returned to their job, I would like the new job tenure variable to recognize this and continue counting the number of weeks after the unemployment spell. Ultimately, I am looking for a cumulative count of the total number of weeks an individual holds the same job, whether the weeks are consecutive or not.

Let me know if you have any suggestions!

Thank you,

Rebecca
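
One possible approach, sketched under assumptions: the data are in long form with one row per person-week, and there is a person identifier, here given the hypothetical name id. Under those assumptions the count does not need tsspell, because weeks can be numbered within each person-employer pair. Treating every negative EMP_STATUS_ code as "not employed" is also an assumption based on the -4 in the example.

```stata
* Toy data mirroring the example: -4 marks unemployment / out of labor force
clear
input id WEEK EMP_STATUS_
1 1 199701
1 2 199701
1 3 -4
1 4 -4
1 5 199701
1 6 199702
end

* Number the weeks within each person-employer pair, in week order;
* rows with a negative status code are left missing
bysort id EMP_STATUS_ (WEEK): generate JOBTENURE = _n if EMP_STATUS_ > 0

* Restore the person-week order for inspection
sort id WEEK
list, sepby(id)
```

Because the counter runs within id and EMP_STATUS_ together, a return to employer 199701 in week 5 continues from the earlier count, while the new employer in week 6 starts again at 1.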

↧

ivprobit versus ivreg2

June 12, 2016, 9:15 am


I have a binary Y, continuous X (and many other controls), and binary Z. The IV coefficient when I use ivprobit is 5 times as high (and the marginal is greater than 1), compared to the IV coefficient when I use ivreg2. Why can this be the case? Which one is more likely to be correct? My binary Y variable is only equal to 1 in 4% of the cases, 0 otherwise.

↧

Nonfatal error r(111) in a constrained maximum likelihood estimation

June 12, 2016, 9:27 am


I have recently programmed a maximum likelihood estimation command that performs, among other things, Heckman estimation for data with corner solutions. I wanted to show that running this estimator with the correlation parameter constrained to zero is equivalent to running churdle. I thus set the constraint

Code:

const 1 _b[athrho:_cons] = 0

And then run the command with the option const(1). I get the right results but the estimation throws the following message

Code:

(note: constraint number 1 caused error r(111))

before presenting the estimates.

I was hoping you would know why, or under what circumstances, maximum likelihood throws this error.

Thanks.

↧

How to get the non central parameter np for using invnFtail

June 12, 2016, 12:44 pm


Dear Statalisters,

I am implementing the method of Hurlin (2004), Testing Granger Causality in Heterogeneous Panel Data Models with Fixed Effects coefficients. I need to get the critical values of the noncentral Fisher (F) distribution. However, I have no idea how to determine the noncentrality parameter of my distribution (referred to in Stata as np). Does anyone know how to get this parameter?

Best regards,

Laura R

↧

Year effects in FE model

June 12, 2016, 12:48 pm


Hello,

I am running a fixed-effects model. More precisely, I was running a two-way fixed-effects model until I found that, when I include i.year, the results become very inconsistent with the empirical literature I am following.

Should I use only a one-way FE model instead, "just" because of this?

Thank you!
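
A hedged suggestion rather than a definitive answer: whether to keep the year effects can be informed by a joint test of the year dummies rather than by agreement with earlier results. The sketch below uses the nlswork example dataset shipped with Stata (fetched by webuse, so it needs an internet connection); the variable choices are illustrative only and are not the poster's model.

```stata
* Illustrative data only: NLS young women panel from the Stata website
webuse nlswork, clear
xtset idcode year

* Two-way fixed effects: individual effects via fe, year effects via i.year
xtreg ln_wage tenure age i.year, fe

* Joint test that all year effects are zero
testparm i.year
display "p-value for joint test of year effects: " r(p)
```

If the test rejects, dropping i.year would likely leave common time shocks in the error term, which is usually a stronger consideration than matching the signs in prior work.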

↧

Lag variable in FE model

June 12, 2016, 12:51 pm


Hello,

I am constructing an FE model and I want my dependent variable, lagged, on the right-hand side of the regression.
I read that FE is not consistent with this lag, and that I should instead estimate an Arellano-Bond GMM model, using the lagged dependent variable as an instrument.
Is that true?

Thank you!
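
For illustration, a minimal sketch comparing the two estimators on the abdata example dataset (fetched by webuse; the variables n, w, and k belong to that dataset, not to the poster's data). The usual concern is that the within estimator is biased when a lagged dependent variable is included and the number of time periods is small. Arellano-Bond estimation first-differences the model and instruments the differenced lag with deeper lags of the dependent variable, rather than using the lag itself as an instrument.

```stata
* Illustrative data only: Arellano-Bond employment panel
webuse abdata, clear
xtset id year

* Fixed effects with a lagged dependent variable (biased when T is small)
xtreg n L.n w k, fe

* Arellano-Bond difference GMM with one lag of the dependent variable
xtabond n w k, lags(1) vce(robust)
```

Comparing the coefficient on L.n across the two fits gives a rough sense of how much the lag matters in a short panel; this is a sketch, not a recommendation of a final specification.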

↧

New in SSC: -fqreg- Quantile regression for non-negative data with a mass-point at zero and an upper bound

June 12, 2016, 12:56 pm


With the usual thanks to Kit Baum, -fqreg- is now available in SSC.

-fqreg- estimates quantile regression for non-negative data with a mass-point at zero and an upper bound (e.g., fractional data with a mass-point at zero), using the specification and method described in Machado, Santos Silva, and Wei (2016).

Please do let me know if you have any questions or comments.

Cheers,

Joao

↧

Shorten Variable Labels

June 12, 2016, 2:48 pm


saveold from Stata 14 to Stata 13 format only works if the variable labels are shorter than 80 characters.
Here is a way to truncate the labels: this code shortens all variable labels to at most 79 characters.
Note: both single and double quotes are in use (compound double quotes, so labels containing a double quote are less likely to break the loop).

* truncate every variable label to at most 79 characters
foreach v of varlist _all {
    local longlabel : variable label `v'
    local shortlabel = substr(`"`longlabel'"', 1, 79)
    label variable `v' `"`shortlabel'"'
}

↧

Prvalue

June 12, 2016, 3:26 pm


I have tried installing prvalue in my Stata 13, with no luck. Is it obsolete, or does it have a new version? It is for predicted values of the ordinal logit model.
thank you
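
As far as I know, prvalue is not an official Stata command but part of the user-written SPost package by Long and Freese, so it has to be located and installed separately (findit prvalue should list what is available). As a sketch of an alternative that needs no installation, the official margins command can report predicted probabilities after ologit. The example uses the fullauto dataset from the Stata website purely for illustration.

```stata
* Illustrative data only: automobile data with a 5-category repair record
webuse fullauto, clear

* Ordered logit of repair record on a few car characteristics
ologit rep77 foreign length mpg

* Predicted probability of outcome 3, for domestic and foreign cars,
* holding the other covariates at their means
margins, at(foreign=(0 1)) atmeans predict(outcome(3))
```

Changing the number inside outcome() gives the predicted probability for a different category of the dependent variable.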

↧

Strings losing precision in Stata

June 12, 2016, 3:36 pm


Hi all,

I am working with a large data set with more than 6000 obs. Currently, I am having difficulty working out why a string variable is losing precision. The variable of interest (v2) is a string because it includes entries that mix special characters and numbers.

Issue 1: when I import the "X_ar" file, variable v2 loses precision. I want to import v2 exactly as it is in the Excel file X_ar.
Issue 2: when I export the imported data set to Excel, the output of variable v2 again lacks precision. I want v2 (exported to Excel file "X") to be the same as in file X_ar. I also think that if we can solve issue 1, issue 2 will most probably be resolved automatically.

Please see the attached Excel file and do-file.

Note: I am using Stata 12

↧

-unzipfile- unzipping zipx files: could not perform unzip r(601)

June 12, 2016, 4:52 pm


I am trying to unzip a set of zip files using -unzipfile- in a loop, and all files unzip successfully except one. I found out that this one file is actually a zipx file. When I try to unzip the problematic file with trace on, I get the output below. 7-Zip or Windows can open the file. I am wondering whether this is a compatibility issue specific to the program that Stata uses for unzipping, and whether I should report it somehow.

Code:

. unzipfile 199703.zip, replace
-------------------------------------------------------------------- begin unzipfile ---
- version 11.0
- syntax anything[, replace]
- gettoken ZipFileName rest : anything
- if (`"`rest'"' != "") {
= if (`""' != "") {
  di as error "invalid syntax"
  exit 198
  }
- if (`"`replace'"' != "") {
= if (`"replace"' != "") {
- local overwrite "overwrite"
- }
- mata : zipfile_cmd()
    error unzipping file: 9703.dat
could not perform unzip
---------------------------------------------------------------------- end unzipfile ---
r(601);
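
A possible workaround, sketched under loud assumptions: since 7-Zip reportedly opens the file, one could fall back to calling it through shell whenever unzipfile fails. The install path below is a guess at a typical Windows location and is not taken from the post. Quoting rules for shell on Windows can be finicky, so this may need adjusting.

```stata
* Try Stata's own unzipfile first; fall back to an external tool on failure.
* ASSUMPTION: 7-Zip is installed at the path below (Windows); adjust as needed.
local zip "199703.zip"

capture noisily unzipfile `zip', replace
if _rc {
    * x = extract with full paths, -y = answer yes to all prompts
    shell "C:\Program Files\7-Zip\7z.exe" x `zip' -y
}
```

Inside the existing loop, the literal file name would be replaced by the loop's own macro, so only the archives that unzipfile cannot handle are passed to the external program.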

↧

lexicographical sorting of strings

June 13, 2016, 7:35 am


Hi,

I have a string variable called "name". Each observation of this variable contains several words (for example: John Anthony Smith). I would like to put the words of each observation in lexicographical order (for example: John Anthony Smith becomes Anthony John Smith).

The only solution I have found is to split each word of the string into its own variable, reshape (wide to long), sort, and reshape back from long to wide. On a big data set this takes a lot of time. Is there a faster way?

Thank you very much!

Antoine
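
One option that avoids reshape, offered as a sketch: do the word sorting in Mata, one observation at a time, and write the result into a new variable (name_sorted is a hypothetical name). It assumes words are separated by blanks. Sorting is by character code, so uppercase letters sort before lowercase ones.

```stata
* Toy data standing in for the real variable
clear
input str40 name
"John Anthony Smith"
"Zoe Ann"
"Single"
end

* Start from a copy so the new variable has the right string width
generate name_sorted = name

mata:
names = st_sdata(., "name")
for (i = 1; i <= rows(names); i++) {
    // split into words, sort them, and glue back together with blanks
    w = tokens(names[i])
    if (cols(w) > 1) names[i] = invtokens(sort(w', 1)')
}
st_sstore(., "name_sorted", names)
end

list
```

Since everything happens in memory without creating extra variables or observations, this is likely to scale better than the double reshape, though I have not timed it on a large file.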

↧

confusing fixed effects results - using "xi i."

June 13, 2016, 7:38 am


I have a data set in which observations are grouped by industry and possess a year variable as well. For the purposes of my analytics I need to determine the yearly fixed effects with respect to several variables when grouped by industry. I have used the following code to do so:

local varlist "x y z"

foreach v in `varlist' {
use masterfile.dta, clear
xi: statsby _b, by(industry_grp) clear: reg `v' i.year
}

(Note: x, y, z are placeholders)

The issue that I am running into is that with one of my variables the fixed effects regressions are producing results for all but one year in my data set - 2013. For this year, the betas listed are all missing with the exception of one or two, which are 0. I have checked and while there are missing values for this variable in 2013, there are also missing values for it in other years as well (just the nature of the data). And there doesn't appear to be a disproportionately large number of missing values in 2013 relative to other years. Furthermore, I don't appear to be having issues with any of the other variables for year 2013.

I am stumped. Not having these values throws a wrench in later analytics, so it would be great to have them. But I have no idea what is going wrong (would love an answer to that) or how to fix my code so as to avoid generating the missing values. Any help would be greatly appreciated!

Thanks in advance,

Dave
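
A diagnostic sketch, not a confirmed explanation: a year coefficient cannot be estimated for an industry group that has no usable observations of the variable in that year, so it may help to tabulate the nonmissing counts by industry and year before running statsby. The file names fe_x, fe_y, and fe_z are hypothetical. Note also that with the clear option each pass of the original loop replaces the previous results, so saving() is used here instead.

```stata
use masterfile.dta, clear

foreach v of varlist x y z {
    * How many usable observations does each industry-year cell have?
    tabulate industry_grp year if !missing(`v')

    * saving() writes one results file per variable (hypothetical names)
    * and leaves the original data in memory for the next pass
    xi: statsby _b, by(industry_grp) saving(fe_`v', replace): regress `v' i.year
}
```

If the 2013 column shows empty or near-empty cells for the problem variable within some industry groups, that would be consistent with the missing and zero coefficients, even when the overall number of missing values in 2013 looks unremarkable.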

↧

Carryforward option, problems with missing observations

June 13, 2016, 8:23 am


Hi.

I have some dummy variables in my dataset that run until 2006. I assume that these dummies have the same values in 2007.
I then count how many observations I have in 2006 so that I can add the same number for 2007. Let's say I had 100 in 2006 and 2000 obs in total. Then I use "set obs 2200" and the following commands:
set obs 2200
replace year = 2007 in 2200
fillin year origin dest
drop _fillin

drop if country1=="" | country2==""
egen id=group(country1 country2)
tsset id year
bysort id: carryforward var1-var2, replace
drop if var1==var2
But when I then sort by year, I see that I have a lot of missing values for the "year" variable, even though the range var1-var2 is filled in for 2007.
Any suggestion as to why this is happening?
Moreover, do you know how I could carry forward the 2006 values of the dummies for 7 years (until 2013) automatically?

thanks
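
Two hedged observations. First, "replace year = 2007 in 2200" sets the year only in the last observation, so the other observations created by set obs keep a missing year, which is one likely source of the missing values. Second, if the goal is simply to repeat each pair's 2006 values through 2013, duplicating the 2006 rows with expand may be simpler than adding empty rows and carrying forward. The sketch below uses toy data; id and d1 are hypothetical stand-ins for the pair identifier and one of the dummies.

```stata
* Toy data: two country pairs observed in 2005 and 2006
clear
input id year d1
1 2005 0
1 2006 1
2 2005 1
2 2006 0
end

* Make 8 copies of every 2006 row (the original plus 7 duplicates) ...
expand 8 if year == 2006

* ... and renumber the copies within each pair as 2006, 2007, ..., 2013
bysort id year: replace year = 2006 + _n - 1 if year == 2006

sort id year
list, sepby(id)
```

Every variable on the 2006 row is copied along with the year, so no separate carryforward step is needed for the dummies. This assumes exactly one 2006 row per pair.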

↧

Inteff - categorical by continuous interaction in logistic regression

June 13, 2016, 8:52 am


Hi,
I am using the command inteff to estimate the interaction effect in a logit model. I am testing a categorical by continuous interaction, but I got a conformability error, r(503).

Could any of you help me with this issue?

Thank you

↧

Manual text on plot for margincontplot

June 13, 2016, 9:29 am


Hi, I'd like to insert text onto a plot using plotopts(twoway_options) feature in mcp. E.g.

sysuse auto, clear
gen inv_price = 1/price
logit foreign inv_price
sum price
range w1 r(min) r(max) 20
gen inv_w1 = 1/w1
mcp price (inv_price), var1(w1 (inv_w1)) show ///
plot(text(0.35 5000 "hello", place(e)))

This code produces the error "invalid point, hello". Does anyone know why this isn't working?

Thanks.

↧

DID with Treatment at Multiple Time Periods

June 13, 2016, 10:30 am


I am performing DID with panel data. There is no single time period for the treatment: some units receive treatment in one period but not in another, and may receive the treatment again in a later period. Moreover, there are 3 types of treatment, so there are three treatment groups and one control group. I am using the following model to estimate DID.

Where gamma is for the time fixed effects, C_it shows the choice (treatment) and G_it shows the treatment group.

I have two questions here. First, is this model correctly specified given the situation explained above?

Secondly, I have data for 269 banks from seven countries. Do I need to incorporate country dummies as well?

↧

Problem with twoway aspect ratio and long titles

June 13, 2016, 11:42 am


Hi there,

There is a problem when you use the twoway suboption "aspectratio(#)" and have a long title in the graph.

What happens is the title seems to displace the y axis further to the left than needed.

Easy to reproduce:

Code:

sysuse auto, clear
scatter price mpg, title("Long Long Long Long Long Long Long Long Long  title") aspectratio(1)

You end up with a graph like the one attached
problem graph.pdf

Compare that with:

Code:

sysuse auto, clear
scatter price mpg, aspectratio(1)

I have already contacted Stata technical support and will report what they have to say, but if anyone knows a workaround, that would be great!

↧

Time Data Variable

June 13, 2016, 2:00 pm


I have a date variable "Month" that holds YYYYMM values in numeric form, e.g. 199901 for Jan 1999, 199902 for Feb 1999, and so on. How do I get Stata to recognize them as monthly data and tsset appropriately? I have tried
gen double eventdate = date(month, "YM")

but it does not seem to work. The variable is stored as a long.

Thank you.
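
A minimal sketch of one way to do this, assuming the variable is numeric and named Month as described. The date() function expects a string argument and builds daily dates, which is probably why the attempt above fails. The year and month can instead be split arithmetically and passed to ym(), which returns a Stata monthly date.

```stata
* Toy data: numeric YYYYMM values stored as long
clear
input long Month
199911
199912
200001
200002
end

* Year = first four digits, month = last two digits
generate eventdate = ym(floor(Month/100), mod(Month, 100))

* Display as a monthly date and declare the time variable
format eventdate %tm
tsset eventdate
list
```

After the format command the new variable displays as 1999m11, 1999m12, and so on, and tsset treats consecutive months as one period apart.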

↧

Dfactor model with missing values/unbalanced panel?

June 13, 2016, 2:16 pm


I'm reproducing the Federal Reserve's LMCI metric, which is the first of 3 dynamic AR(2) factors calculated from 19 different labor market series. The methodology is laid out here and here.

The authors estimate the model beginning in 1976 but not all 19 of their series span this entire estimation window. They call this an "unbalanced panel" which they note a dynamic factor model is "well-suited for", but to prevent confusion I think it's simpler and more accurate to say that they're just estimating the model with missing data, before all 19 series are available.

So my question is simple: can Stata estimate a dfactor model with some missing series/an unbalanced panel? My impression was no.

↧