Dear All,
I am working with a dataset of tweets by politicians. For each tweet, dummy variables such as "inequality" and "migration" in the example below are equal to 1 if that tweet deals with the corresponding topic (e.g. inequality or migration, respectively). My final objective is to set up a dynamic panel model to study the relationship between politicians talking about particular subjects and their shares in present and past opinion polls (variable "share").
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str22 name_polit str7 yearmo double share float(inequality migration) str33 tweet_id
"Politician X" "2018m2" 13.1 1 0 "797153663818320_802734809926872"
"Politician X" "2018m2" 13.1 0 0 "797153663818320_805419726325047"
"Politician X" "2018m2" 13.1 0 1 "797153663818320_801139120086441"
"Politician X" "2018m3" 18.566666666666666 0 0 "797153663818320_806538146213205"
"Politician X" "2018m3" 18.566666666666666 1 0 "797153663818320_809722572561429"
"Politician X" "2018m3" 18.566666666666666 0 0 "797153663818320_806901342843552"
"Politician X" "2018m3" 18.566666666666666 0 1 "797153663818320_808973869302966"
"Politician X" "2018m4" 20.45 0 0 "797153663818320_828926897307663"
"Politician X" "2018m4" 20.45 1 0 "797153663818320_823786454488374"
"Politician X" "2018m4" 20.45 0 0 "797153663818320_819100928290260"
"Politician X" "2018m4" 20.45 0 1 "797153663818320_819338594933160"
"Politician X" "2018m5" 22.640000000000004 1 0 "797153663818320_835309080002778"
"Politician X" "2018m5" 22.640000000000004 0 1 "797153663818320_832123746987978"
"Politician X" "2018m5" 22.640000000000004 0 0 "797153663818320_839839332883086"
end
My problem is that the poll shares are monthly, but the same politician tweets several times within a month. Hence I cannot xtset using "name_polit" as id and "yearmo" as time. I'm stuck on how to organize the panel: if I want to know whether a politician talked about migration because his or her share went down in the previous month, I need to preserve the monthly structure, but that violates the panel structure because time values are repeated within panel. I cannot use "tweet_id" either, because it uniquely identifies each single tweet and therefore is not repeated over time.
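To make the problem concrete, one possible arrangement (a sketch only, and not necessarily the right one for a dynamic panel model) would be to collapse the tweet-level data to one observation per politician and month, so that the topic dummies become monthly counts and the time values are no longer repeated within panel. The names mdate, n_tweets, id_polit and share_lag are hypothetical and do not exist in the data above; the sketch also assumes that monthly() parses strings such as "2018m2" with the "YM" mask.

```stata
* Sketch only: one row per politician-month, then xtset.
* mdate, n_tweets, id_polit and share_lag are hypothetical names.

* Convert the string month (e.g. "2018m2") to a numeric monthly date
gen mdate = monthly(yearmo, "YM")
format mdate %tm

* Count tweets so the number per month survives the collapse
gen n_tweets = 1

* Topic dummies become monthly counts; share is constant within
* politician-month, so its mean is just that month's value
collapse (sum) inequality migration n_tweets (mean) share, by(name_polit mdate)

* xtset needs a numeric panel identifier
egen id_polit = group(name_polit)
xtset id_polit mdate

* With the panel declared, time-series operators work on the monthly data
gen share_lag = L.share
```

Collapsing discards the order of tweets within a month, so this would only suit a model specified at the politician-month level; whether counts, shares of tweets (e.g. migration / n_tweets), or a 0/1 indicator is the better outcome is left open here.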
I would really appreciate your help on this.
Many thanks!
Giovanni