Seedcorn Grant Report by Kristina Petkova
18.01.2019, by Tina Keil in grant report
Bulgarian Academy of Sciences, Bulgaria; Project: The Challenged Identity. Struggles around the Council of Europe Convention
Thanks to the European Association of Social Psychology I received a Seedcorn grant in May 2018. The grant supported a research project which I carried out together with Valery Todorov (Member of EASP) and Kiril Simov (Bulgarian Academy of Sciences).
The idea for this project was triggered by the controversies that had been taking place in Bulgaria in relation to the Council of Europe Convention on preventing and combating violence against women and domestic violence, known as the Istanbul Convention (IC).
We believed that exploring the content of the arguments of the opposing parties, identifying the “tricks” they use to persuade the audience, and tracing the sources triggering the confrontation would reveal the underlying causes for the debate.
We expected that the arguments in the debate would a) boil down to the issue of identity threat (Branscombe, Ellemers, Spears, & Doosje, 1999; Ellemers, Spears, & Doosje, 2002), b) pertain to the rising anti-liberal trends in a number of European countries, and c) have geopolitical dimensions.
To test these hypotheses we applied content analysis to texts related to the debate. We searched the websites of news media, newspapers, TV channels, political parties, NGOs, and the Parliament using keywords such as “Istanbul convention”, “violence against women”, and “domestic violence”. The resulting corpus comprises more than two million running words, which we consider large enough to serve as a representative sample. The main target period was from the beginning of 2018 to June 30th, 2018. The plan of work comprised the following steps:
- Corpus preparation
- Metadata extraction
- Opinion annotation
- Linguistic processing of the corpora
- Vocabulary creation
- Interpretation of the co-occurrence of concepts mentioned in the same document, based on the correlation between the keyword annotation in the text, the metadata, and the opinion annotation (for or against the convention)
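The last step of the plan can be illustrated with a minimal Python sketch. It counts how often pairs of keywords appear in the same document, separately for each opinion label. The record layout (`stance`, `keywords`) and the sample documents are assumptions made for this illustration only; they are not the project's actual data or tooling.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_by_stance(documents):
    """Count keyword pairs that share a document, grouped by stance.

    Each document is a dict with a hypothetical "stance" label
    ("for" or "against") and a list of annotated "keywords".
    """
    counts = {}
    for doc in documents:
        # Deduplicate and sort so each pair is counted once per document
        # and always appears in the same order.
        keywords = sorted(set(doc["keywords"]))
        pair_counter = counts.setdefault(doc["stance"], Counter())
        pair_counter.update(combinations(keywords, 2))
    return counts


# Invented sample records, for illustration only.
docs = [
    {"stance": "against", "keywords": ["gender", "identity", "family"]},
    {"stance": "against", "keywords": ["gender", "identity"]},
    {"stance": "for", "keywords": ["violence", "prevention", "gender"]},
]
result = cooccurrence_by_stance(docs)
```

Comparing the most frequent pairs under each stance gives a first, purely descriptive view of which concepts tend to be invoked together by each side; a statistical test of association would be a separate step.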
The initial corpus was created by a specialized news agency, Media Point World. The first stage of corpus processing was carried out manually: annotators randomly selected articles and extracted keywords, on the basis of which the corpus was extended. The corpus was then processed structurally (lemmatized) and converted to a TEI XML representation. After this it was linguistically processed, which allowed us to identify the base forms of the words used. On the basis of this lemmatization and the syntactic analysis of the text, additional keywords and phrases were identified using statistical tools such as AntConc (see Anthony, 2014). The visualization (in the form of a word cloud) was produced with the following web service: https://www.jasondavies.com/wordcloud/.
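To give a rough idea of what a lemmatized document in a TEI-style XML representation can look like, here is a minimal Python sketch using only the standard library. It relies on the TEI `w` element with its `lemma` attribute; the function name `to_tei`, the sample tokens, and the stripped-down header are assumptions for illustration. The output is not a schema-valid TEI file and does not reproduce the project's actual encoding.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# Serialize TEI elements with a default namespace instead of an "ns0:" prefix.
ET.register_namespace("", TEI_NS)


def tei(tag):
    """Return the namespace-qualified name of a TEI element."""
    return f"{{{TEI_NS}}}{tag}"


def to_tei(title, tokens):
    """Build a simplified TEI tree from (word form, lemma) pairs."""
    root = ET.Element(tei("TEI"))
    header = ET.SubElement(root, tei("teiHeader"))
    file_desc = ET.SubElement(header, tei("fileDesc"))
    title_stmt = ET.SubElement(file_desc, tei("titleStmt"))
    ET.SubElement(title_stmt, tei("title")).text = title

    text = ET.SubElement(root, tei("text"))
    body = ET.SubElement(text, tei("body"))
    paragraph = ET.SubElement(body, tei("p"))
    for form, lemma in tokens:
        # Each token keeps its surface form as text and its base form
        # in the "lemma" attribute.
        word = ET.SubElement(paragraph, tei("w"), {"lemma": lemma})
        word.text = form
    return root


# Invented sample tokens, for illustration only.
doc = to_tei(
    "Sample article",
    [("conventions", "convention"), ("were", "be"), ("discussed", "discuss")],
)
xml_string = ET.tostring(doc, encoding="unicode")
```

Storing the lemma next to each word form is what lets frequency and keyword tools count all inflected variants of a word as one item.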
The documents in the corpus were also annotated with metadata. This metadata provides information about the source of the publication, the author (if available; otherwise we treat the source as the author of the material), the organization represented by the author (whether they are members of a party or of another type of organization), and the date and time of publication. Sometimes it is not possible to identify the first publisher, because only the date is given.
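The metadata rules described above can be summarized in a small Python sketch. The class and field names are hypothetical and chosen only for illustration; they show the fallback from author to source and the case where only a date, but no time, is known.

```python
from dataclasses import dataclass
from datetime import date, time
from typing import Optional


@dataclass
class DocumentMetadata:
    """Hypothetical metadata record for one document in the corpus."""

    source: str
    published_on: date
    author: Optional[str] = None
    organization: Optional[str] = None
    published_at: Optional[time] = None

    @property
    def effective_author(self) -> str:
        # When no author is given, the source is treated as the author.
        return self.author or self.source

    @property
    def has_exact_time(self) -> bool:
        # Without a time of day, the first publisher among same-day
        # publications cannot be determined.
        return self.published_at is not None


# Invented sample record, for illustration only.
record = DocumentMetadata(source="Example News Site", published_on=date(2018, 2, 1))
```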
During the manual processing of the corpus we found that news about the IC was very often mixed with news about other topics. These additional, irrelevant texts have a negative impact on the automatic extraction of keywords and phrases. We therefore edited the documents extensively by hand to remove unnecessary text. In this process we carried out the first classification of the types of texts and of the style in which issues related to the IC are presented. We could thus classify the documents mentioning the IC into the following groups:
- Articles where the IC is mentioned without any analysis or opinion about its content, for example in expressions like “next week the IC will be discussed at organization ZZZ.”
- Large articles in which only a short text mentions IC (discussed above).
- Articles where the opinions of one or several politicians or representatives of different organizations are expressed or analyzed.
- Articles which present neutral analyses of the content of the IC. Usually three types of issues are presented: the prevention of violence against women and of domestic violence and its importance for Bulgaria; the notion of gender as socially constructed roles, behaviors, activities and attributes; and the role of the Group of experts on action against violence against women and domestic violence (GREVIO).
Having finished the manual annotation and identified the negative connotations in the randomly selected articles, we proceeded to develop a tool which automatically extracts the text segments with negative connotations from the entire corpus.
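The report does not describe how this tool works internally. Purely as an illustration of the general idea, the following Python sketch selects sentences that contain at least one entry from a list of negative markers. The lexicon-based approach, the function name, the marker list, and the sample text are all assumptions, not the project's actual method.

```python
import re


def extract_negative_segments(text, negative_markers):
    """Return the sentences that contain at least one negative marker.

    This is a simple lexicon lookup: a sentence is kept when any marker
    occurs in it as a case-insensitive substring.
    """
    # Naive sentence split on whitespace that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    markers = [marker.lower() for marker in negative_markers]
    return [
        sentence
        for sentence in sentences
        if any(marker in sentence.lower() for marker in markers)
    ]


# Invented sample text and marker list, for illustration only.
sample = (
    "The convention will be discussed next week. "
    "Critics call it a threat to tradition. "
    "Supporters stress prevention."
)
segments = extract_negative_segments(sample, ["threat", "danger"])
```

In practice the markers would come from the manually annotated articles, and matching on lemmas rather than raw substrings would reduce both missed and spurious hits.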
This is what we have done so far. Once we obtain the data, the next steps will be the social psychological analysis and the hypothesis testing. Apart from testing our expectations, the outcomes of the project are: a) an annotated corpus supporting the analysis, which can be useful for other research and for longitudinal studies; and b) a vocabulary of the corpus, which could be useful not only for this particular study but also for the analysis of related topics.