I have a dataset of observations on children: some are singletons, and some are sets of 2 or 3 siblings. I want to keep only 1 sibling from each set for analysis, selected at random from the 2 or 3 in the dataset. Siblings share the same household ID (hhid) but have different child IDs (childid). I have identified them by their duplicate hhid values, but I don't want to use the "duplicates drop" command because it keeps the first observation in each group, and I would like to keep a randomly selected one. What is the best way to do this?
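
For concreteness, here is a minimal sketch of one possible approach, not necessarily the best one. It assumes that hhid and childid together uniquely identify an observation and that a fixed seed is acceptable for reproducibility. The seed value and the helper variable u are hypothetical names introduced only for illustration.

```stata
* Assumption: hhid + childid uniquely identify rows; isid stops with an error if not
isid hhid childid

* Fix the sort order and the seed so the random draw is reproducible
sort hhid childid
set seed 12345

* u is a hypothetical helper variable holding one uniform draw per child
generate double u = runiform()

* Within each household, order by the random draw and keep the first row
bysort hhid (u childid): keep if _n == 1

drop u
```

Singletons are the only observation in their household, so they are kept unchanged. Within a sibling set, each child has the same chance of receiving the smallest draw, so the child kept should be a uniformly random pick.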