Truthfulness Stance Annotation

Thank you for participating in truthfulness stance annotation!

UT Arlington
Informed Consent Document

CONSENT

By clicking “Accept” below, you confirm that you are 18 years of age or older and have read or had this document read to you. You have been informed about this study’s purpose, procedures, possible benefits and risks, and you have received a copy of this form. You have been given the opportunity to ask questions before you click “Accept”, and you have been told that you can ask other questions at any time.

You voluntarily agree to participate in this study. By clicking “Accept”, you are not waiving any of your legal rights. Refusal to participate will involve no penalty or loss of benefits to which you are otherwise entitled. You may discontinue participation at any time without penalty or loss of benefits to which you are otherwise entitled.

PRINCIPAL INVESTIGATOR

Chengkai Li
Associate Professor
Department of Computer Science and Engineering
(817) 272-0162
cli@uta.edu

STUDENT PERSONNEL

Zhengyuan Zhu
Ph.D. Candidate
Department of Computer Science and Engineering
(682) 259-5848
zhengyuan.zhu@mavs.uta.edu

IMPORTANT INFORMATION ABOUT THIS RESEARCH PROJECT

The research team above is conducting a research study that aims to collect human annotations for training a machine learning model that measures truthfulness stance, that is, whether a social media post expresses that a factual claim is true, expresses that it is false, or takes a neutral stance. Specifically, you will be provided with pairs of sentences to annotate. Each pair consists of a factual claim from PolitiFact and a tweet from Twitter. You will be asked to decide the truthfulness stance of a tweet toward a factual claim. For that, you choose one of the six options: The tweet believes the factual claim is false; The tweet expresses a neutral or no stance toward the factual claim’s truthfulness; The tweet believes the factual claim is true; The tweet and the claim discuss different topics; The tweet-claim pair is problematic; Skip this pair.

The machine learning model can become useful in automating fact-checking. Imagine that a fact-checker wants to understand public opinion and reaction toward a factual claim. When a Twitter account posts a tweet that is related to a factual claim, the machine learning model can help identify whether the tweet believes the factual claim is true or not.

You can choose to participate in this research study if you are 18 years of age or older and fluent in English. Please note that you are not eligible for this study if you are not a fluent English speaker or are younger than 18. You might want to participate in this study if you would like to get a taste of how technology might help fact-checkers understand the spread of factual claims and to familiarize yourself with the fact-checking process. However, you might not want to participate in this study if you do not have the time or are not interested in the fact-checking process.

This study has been reviewed and approved by an Institutional Review Board (IRB). An IRB is an ethics committee that reviews research with the goal of protecting the rights and welfare of human research subjects. Your most important right as a human subject is informed consent. You should take your time to consider the information provided by this form and the research team and ask questions about anything you do not fully understand before making your decision about participating.

TIME COMMITMENT

If you decide to participate in this study, you have full freedom to decide the duration of your participation. You may participate for as little as a couple of minutes or return as many times as you want until the end of the study.

PROCEDURES

If you decide to participate in this research study, this is the list of activities that we will ask you to perform as part of the research:

1. Go to https://idir.uta.edu/stance_annotation and click on the “Sign Up” button at the top right corner.
2. This will open this consent form, which you need to accept if you want to participate in the study. If you decide that you do not wish to participate, click the “Decline” button or close your web browser.
3. If you click the “Accept” button, you will be presented with the account registration form to create an account by providing a username, an email address, and a password. Note that only the first three letters of your username will be visible to other participants in this study. We will use your email address only to communicate with you regarding this study. We will delete your username, email, and password once we have concluded the study.
4. We will send you a verification email. Once you click the verification link in the email, your account will be activated. If you cannot find the email, please check your junk/spam email box.
5. Using the registered account, you will log into our data annotation website and arrive at the instructions page. Read the instructions carefully and continue to the annotation page. On the top right corner of the webpage, there is a link to the instructions page if you need to reread the instructions at any time.
6. On the annotation page, you will be asked to read a pair of sentences (a factual claim and a tweet). Your task is to decide the truthfulness stance of the tweet toward the factual claim. For that, you should choose one of the six options: The tweet author believes the factual claim is false; The tweet author expresses a neutral or no stance toward the factual claim’s truthfulness; The tweet author believes the factual claim is true; The tweet and the claim discuss different topics; The tweet-claim pair is problematic; Skip this pair. The data annotation website will record your choices as well as the timestamp of each interaction with the website.
7. After submitting your choice, a new pair of sentences will appear for annotation. This process will continue until you decide to stop by clicking the “Log Out” button or by closing the web browser.
8. If you want to modify the choice you made on a previous pair, click “Modify My Previous Responses”, where you can see all pairs on which you provided annotations, ordered by timestamp. After your modification, your work quality score may change.
9. You will be asked to complete 40 training annotations. Upon submitting your answer to a training example, the website will display a message indicating whether the answer is correct or not and provide a justification for the correct answer.

After the first 50 annotations, the “Leaderboard” button will be activated. By clicking it, a pop-up window will show your work quality score, total points, and where you stand among other participants. Only the first three letters of usernames (instead of full names) will be displayed on the leaderboard. The leaderboard calculation is updated every 15-30 pairs, reflecting your new work quality score and total points.

To gauge the quality of your choices, we use several “gold standard” sentence pairs, for which we have “correct choices” selected by research experts on the subject. Your choices will be compared with the “gold standard” choices to estimate your work quality level. There is no visual distinction between a pair from the “gold standard” set and a pair outside that set.

Below is a list of tips for improving your work quality score. You can find a copy of this list under each annotation pair.

(1) Carefully examine each pair of factual claim and tweet.
(2) Contextual information (such as the fact-check summary, claimant information, hyperlink title and content) may help you form answers.
(3) Review the instructions to understand the examples.
(4) Don't guess. Skip the pairs that you are not sure about.
(5) Modify previous responses if necessary.
(6) You may be tempted to pick easy/short claims to work on by clicking “Skip this pair”. Keep in mind that our work quality calculation formula has a component that accounts for the length/complexity of claims as well as how many pairs are skipped. We discourage excessive skipping. Nevertheless, if you are not confident about a question, it is still better to skip, because every single mistake will lower your work quality score.

Whenever you make a mistake, our algorithm lowers your work quality score, which means you get fewer points for every pair you have annotated. It takes multiple correct answers to make up for every single mistake and get the work quality score back to its previous value. If your current work quality score is 0 or very low, it is because our algorithm detected many mistakes in your answers. The best thing to do is to review your answers and modify them if necessary. If your work quality score is 0, it might actually be negative internally. If you continue to answer new questions, it will take MANY questions before you can see a positive and improving work quality score. If you have labeled only 50-150 pairs, the work quality score based on this small sample may not reflect your true work quality. It will become more robust once you have labeled more pairs.

We may email you about our optional data annotation training workshops in our Lab (ERB - Room 414) or online through MS Teams, which are available to all participants. The purpose of the workshops is to review the information provided on the instructions page, discuss any questions you might have, and annotate some pairs. We expect this activity to be helpful in improving annotation quality. Note that other participants may also be present in the data annotation workshops.

POSSIBLE BENEFITS

You will get a taste of how technology might help fact-checkers understand the spread of factual claims. AI technologies will be used for processing the tweet and claim text and extracting information from the processed text. We will also use AI technology to build a truthfulness stance detection model based on the annotations from subjects. Furthermore, the research outcome might lead to advancement in data-driven fact-checking, which tackles an important scientific challenge.

POSSIBLE RISKS/DISCOMFORTS

If the data is lost or stolen, you may be exposed as a research subject in the study. In addition, there can be a risk of undue influence because a professor figure is recruiting a student. Since there is an authority difference between these two parties, it is possible that you might feel compelled to participate in the research against your best interests. You may worry about your grades, future research opportunities, etc. if you decline.

There may be psychological risks because you may be affected by the content of the factual claims and tweets. A sizable proportion of factual claims is misinformation. You may potentially misunderstand and consider misinformation as fact. Furthermore, although highly rare given the way we collected tweets for the study, it is possible for a tweet to contain filthy language, profanity, and hate speech. You may feel anger or embarrassment during the annotation.

This research study is not expected to pose any additional risks beyond what you would normally experience in your regular everyday life. However, if you experience discomfort, please inform the research team; you may quit the study without any consequence to you.

To minimize the risk to privacy or confidentiality, we are storing the data on UTA servers and will limit its access to the research team only. To minimize the risk of undue influence, we confirm that your decision to participate or not participate will not influence your grades or future research opportunities in any way. To minimize the risk of psychological influence, we have removed tweets that contain images, GIFs, or videos.

COMPENSATION

You will not be compensated for your participation.

ALTERNATIVE OPTIONS

There are no alternative procedures offered for this study. However, you can elect not to participate in the study or quit at any time without consequence.

CONFIDENTIALITY

The research team is committed to protecting your rights and privacy as a research participant. All paper and electronic data collected from this study will be stored in a secure location on the UTA campus and/or a secure UTA server for at least three (3) years after the end of this research. The results of this study may be published and/or presented without naming you as a participant. The data collected about you for this study may be used for future research studies that are not described in this consent form. If that occurs, an IRB would first evaluate the use of any information that is identifiable to you, and confidentiality protection would be maintained. While absolute confidentiality cannot be guaranteed, the research team will make every effort to protect the confidentiality of your records as described here and to the extent permitted by law. In addition to the research team, the following entities may have access to your records, but only on a need-to-know basis: the U.S. Department of Health and Human Services and the FDA (federal regulating agencies), the reviewing IRB, and sponsors of the study.

CONTACT FOR QUESTIONS

Questions about this research study may be directed to Dr. Chengkai Li at (817) 272-0162 or cli@uta.edu and Zhengyuan Zhu at zhengyuan.zhu@mavs.uta.edu. Any questions you may have about your rights as a research subject or complaints about the research may be directed to the Office of Research Administration; Regulatory Services at 817-272-3723 or regulatoryservices@uta.edu.
