Channel: Statalist

generating case counts for level 2 units in hierarchical dataset in wide format

I have hierarchical data with 3 levels (cases nested in groups, and groups nested in geographic spaces). The data are in wide format, where each row is a level 1 case. There are about 2 million level 1 units. At present I am trying to get a simple count of the number of level 1 units within each level 2 unit. The level 2 units are identified by a string variable, and many entries are duplicates that result from data entry errors (for example, a misspelling). Using egen's tag function, I know there are about 19,000 unique entries (again, many of these are really the same unit with slightly different spellings, and that is the issue I am attempting to address).

Code:
egen grouptag=tag(level2var)
In going through and hand coding these 19,000 level 2 entries, it is helpful to see which entries have only 1 or 2 cases associated with them and which have many (say 1,000 or more); the former are likely to be the result of typographical mistakes, while the latter are valid names of level 2 units.

Again using egen, I can assign a numeric group identifier to these units...
Code:
egen groups=group(level2var)
Of course I could simply do a count by each group...
Code:
bysort groups: count
but with 19,000 or more level 2 units this is too onerous.

I need a variable that contains the count of level 1 units for each unique level 2 entry. I tried the following, but it does not work. This is the issue.

Code:
bysort groups: egen casecount=count
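One likely reason the line above fails is that egen functions take their arguments in parentheses, and `count` is written here without any. A minimal sketch of one way to get the per-entry count is below. It uses a small made-up dataset in place of the real file, assumes the string identifier is called `level2var` as in the earlier code, and relies on `_N` under `bysort` rather than on egen.

```stata
* Toy data standing in for the real file (values are invented for illustration)
clear
input str12 level2var
"Alpha"
"Alpha"
"Alpha"
"Alhpa"
"Beta"
"Beta"
end

* Within a by-group, _N is the number of rows in that group, so every row
* receives the case count of its own level 2 entry
bysort level2var: generate long casecount = _N

* Tag one row per distinct entry and list each entry with its count
egen grouptag = tag(level2var)
list level2var casecount if grouptag, clean noobs
```

Because `bysort` works directly on string variables, the numeric `groups` variable is not needed for this step. Note that `_N` counts every row in the group, including rows with missing values on other variables.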
Ideally I could then list each level 2 entry with the number of level 1 cases associated with it, put that list in Excel, and clean it up by grouping duplicate entries under a unified code/name. That is the hope anyway.

Code:
list level2var casecount if grouptag, clean noobs
I know there is a simple way to do this.
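For the Excel step, one possible route is to reduce the data to one row per distinct level 2 entry and export that table. The sketch below is only an illustration: the toy data are invented, `level2var` is assumed to be the string identifier, and the file name `level2_counts.xlsx` is hypothetical.

```stata
* Toy data standing in for the real file (values are invented for illustration)
clear
input str12 level2var
"Alpha"
"Alpha"
"Alpha"
"Alhpa"
"Beta"
"Beta"
end

* Replace the data in memory with one row per distinct entry;
* casecount holds the number of level 1 rows for that entry
contract level2var, freq(casecount)

* Put the rarest entries first, since those are the likeliest misspellings
sort casecount level2var

* Write the table to a spreadsheet (hypothetical file name)
export excel using "level2_counts.xlsx", firstrow(variables) replace
```

Since `contract` overwrites the dataset in memory, wrapping these steps in `preserve` and `restore` would keep the original 2 million rows available afterwards.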


