I have data that look like this:
NO. CODE
1 TU4E0620212748560 KEN Z0563653320130207102629PH0101207000ERRT0901205MEMB ER
2 TU4E0620213148294 Z0563653320130812115544PH0101207000ERRT0901205MEMB ER
I have about 49k observations like the above, and I want to use my codebook to write a program that interprets these codes.
Main Problem 1. When I insheet the data into Stata, almost none of the commands I try work on it. I tried destring, recast, tostring, and encode to make it readable to Stata, but all of them failed.
How can I make such data readable for the software so that I can use functions like substr, strpos, etc.?
Main Problem 2. Each observation does not have the same number of codes: one person may have codes starting with TU.. and Z0 but no KEN, and another will have a different combination of such codes. What I want to do is build binary variables that indicate whether a person has a particular category of code. Is there a technique that can help me create a column corresponding to each category of codes?
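To make the goal concrete, here is a rough sketch of the result I am after. It assumes the data are already in memory with the whole code held in one string variable, here called code (a hypothetical name), and the prefixes in the loop are only the ones visible in the two example rows, not my full codebook.

```stata
* Sketch only: assumes a string variable named code (hypothetical name)
* that holds the full code for each observation.
* The prefixes below are illustrative, taken from the two example rows.
foreach p in TU KEN Z0 {
    * strpos() returns 0 when the substring is absent,
    * so the comparison yields 1 if the prefix appears and 0 otherwise
    gen byte has_`p' = strpos(code, "`p'") > 0
}

* Inspect the new indicator columns for the first two observations
list code has_TU has_KEN has_Z0 in 1/2
```

One caveat with this approach: strpos matches the text anywhere in the string, so a short prefix such as TU would also be flagged if those letters happen to occur inside another code segment.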
Best,