Contribute to our data research and make a difference.
The University of Texas at Arlington (UTA)
Informed Consent for Studies with Adults
Data Annotation for Claim-Matching
RESEARCH TEAM PRINCIPAL INVESTIGATOR
Chengkai Li
Professor and Associate Chair
Department of Computer Science and Engineering
(817) 272-0162
cli@uta.edu
Theodora Toutountzi
Ph.D. Candidate Student
Department of Computer Science and Engineering
theodora.toutountzi@mavs.uta.edu
The research team above is conducting a research study that aims to collect annotated data (often called labeled data in the field of artificial intelligence and machine learning) that would be used to train a machine learning model for measuring how helpful an existing fact-check is in vetting a factual claim that was not fact-checked before. Specifically, you will be provided with pairs of sentences to annotate. Each pair consists of a fact-check published by fact-checking outlets such as PolitiFact, FactCheck.org, Washington Post Fact Checker, and Snopes and an unchecked claim derived from political debates, speeches, interviews, etc. For each pair, you will be asked to decide how helpful the fact-check is in vetting the unchecked political claim. You will continue this same process for as many available pairs as you wish.
For that, you will choose one of five options: Not at all helpful, Slightly helpful, Somewhat helpful, Very helpful, or Extremely helpful. The machine learning model can become useful in automating fact-checking. Imagine the audience of a presidential debate. When a presidential candidate repeats a factual statement that has been marked as false by professional fact-checkers, the machine learning model can help identify previous relevant fact-checks and present them to the audience. This can help mitigate the spread of misinformation, which is a fundamental threat to modern society.
You can choose to participate in this research study if you are 18 years of age or older and fluent in English. Please note that you are not eligible for this study if you are not a fluent English speaker or are younger than 18.
You might want to participate in this study if you would like to get a taste of how technology might help vet factual statements and familiarize yourself with the fact-checking process. However, you might not want to participate in this study if you do not have the time or are not interested in the fact-checking process.
This study has been reviewed and approved by an Institutional Review Board (IRB). An IRB is an ethics committee that reviews research with the goal of protecting the rights and welfare of human research subjects. Your most important right as a human subject is informed consent. You should take your time to consider the information provided by this form and the research team and ask questions about anything you do not fully understand before making your decision about participating.
If you decide to participate in this study, you should know that you have the full freedom to decide the duration of your participation, from a couple of minutes to as long as you prefer, and as many times as you want until the end of the study.
If you decide to participate in this research study, this is the list of activities that we will ask you to perform as part of the research:
You will get a taste of how technology might help vet factual statements. Furthermore, the research outcome might lead to advancement in data-driven fact-checking, which tackles an important scientific challenge.
This research study is not expected to pose any additional risks beyond what you would normally experience in your regular everyday life. However, if you experience discomfort, please inform the research team. You may quit the study at any time without any consequence to you.
To minimize the risk of your personal information being exposed, we are storing your personal data on UTA servers, limiting its access to the research team only.
Your decision to participate or not participate will not influence your grades or future research opportunities in any way.
You will not be compensated for your participation.
There are no alternative procedures offered for this study. However, you can elect not to participate in the study or quit at any time at no consequence.
The research team is committed to protecting your rights and privacy as a research participant. All paper and electronic data collected from this study will be stored in a secure location on the UTA campus and/or a secure UTA server for at least three (3) years after the end of this research.
The results of this study may be published and/or presented without naming you as a participant. The data collected about you for this study may be used for future research studies that are not described in this consent form. If that occurs, an IRB would first evaluate the use of any information that is identifiable to you, and confidentiality protection would be maintained.
While absolute confidentiality cannot be guaranteed, the research team will make every effort to protect the confidentiality of your records as described here and to the extent permitted by law. In addition to the research team, the following entities may have access to your records, but only on a need-to-know basis: the U.S. Department of Health and Human Services and the FDA (federal regulating agencies), the reviewing IRB, and sponsors of the study.
Questions about this research study may be directed to Dr. Chengkai Li at (817) 272-0162 or cli@uta.edu and Theodora Toutountzi at theodora.toutountzi@mavs.uta.edu. Any questions you may have about your rights as a research subject or complaints about the research may be directed to the Office of Research Administration; Regulatory Services at 817-272-3723 or regulatoryservices@uta.edu.
By clicking “Accept”, you are confirming that you understand the study’s purpose, procedures, potential risks, and your rights as a research subject. By agreeing to participate, you are not waiving any of your legal rights. You can refuse to participate or discontinue participation at any time, with no penalty or loss of benefits that you would ordinarily have. Please click “Accept” if you are at least 18 years of age and voluntarily agree to participate in this study.
UT Arlington
Informed Consent Document
By clicking “Accept” below, you confirm that you are 18 years of age or older and have read or had this document read to you. You have been informed about this study’s purpose, procedures, possible benefits and risks, and you have received a copy of this form. You have been given the opportunity to ask questions before you click “Accept”, and you have been told that you can ask other questions at any time.
You voluntarily agree to participate in this study. By clicking “Accept”, you are not waiving any of your legal rights. Refusal to participate will involve no penalty or loss of benefits to which you are otherwise entitled. You may discontinue participation at any time without penalty or loss of benefits to which you are otherwise entitled.
Chengkai Li
Associate Professor
Department of Computer Science and Engineering
(817) 272-0162
cli@uta.edu
Zhengyuan Zhu
Ph.D. Candidate Student
Department of Computer Science and Engineering
(682) 259-5848
zhengyuan.zhu@mavs.uta.edu
The research team above is conducting a research study that aims to collect human annotations that will be used to train a machine learning model for detecting truthfulness stance, that is, whether a social media post expresses the belief that a factual claim is true, expresses the belief that it is false, or expresses a neutral stance. Specifically, you will be provided with pairs of sentences to annotate. Each pair consists of a factual claim from PolitiFact and a tweet from Twitter. You will be asked to decide the truthfulness stance of the tweet toward the factual claim. For that, you will choose one of six options: The tweet believes the factual claim is false; The tweet expresses a neutral or no stance toward the factual claim’s truthfulness; The tweet believes the factual claim is true; The tweet and the claim discuss different topics; The tweet-claim pair is problematic; Skip this pair.
The machine learning model can become useful in automating fact-checking. Imagine that a fact-checker wants to understand public opinion and reaction toward a factual claim. When a Twitter account posts a tweet that is related to a factual claim, the machine learning model can help identify whether the tweet expresses the belief that the factual claim is true or false.
You can choose to participate in this research study if you are 18 years of age or older and fluent in English. Please note that you are not eligible for this study if you are not a fluent English speaker or are younger than 18.
You might want to participate in this study if you would like to get a taste of how technology might help fact-checkers understand the spread of factual claims and familiarize yourself with the fact-checking process. However, you might not want to participate in this study if you do not have the time or are not interested in the fact-checking process.
This study has been reviewed and approved by an Institutional Review Board (IRB). An IRB is an ethics committee that reviews research with the goal of protecting the rights and welfare of human research subjects. Your most important right as a human subject is informed consent. You should take your time to consider the information provided by this form and the research team and ask questions about anything you do not fully understand before making your decision about participating.
TIME COMMITMENT
If you decide to participate in this study, you should know that you have the full freedom to decide the duration of your participation, from a couple of minutes to as long as you prefer, and as many times as you want until the end of the study. To get paid, you need to annotate at least 50 initial pairs after your account registration, which, based on our estimation, takes about 30 minutes on average.
PROCEDURES
If you decide to participate in this research study, this is the list of activities that we will ask you to perform as part of the research:
1. Go to https://idir.uta.edu/stance_annotation and click on the “Sign Up” button at the top right corner.
2. This will open this consent form, which you need to accept if you want to participate in the study. If you decide that you do not wish to participate, click the “Decline” button or close your web browser.
3. If you click the “Accept” button, you will be presented with the account registration form to create an account by providing a username, an email address, and a password. Note that only the first three letters of your username will be visible to other participants in this study. We will use your email address only to communicate with you regarding this study and for your payment information. We will delete your username, email, and password once we have concluded the study.
4. We will send you a verification email. Once you click the verification link in the email, your account will be activated. If you cannot find the email, please check your junk/spam folder.
5. Using the registered account, you will log into our data annotation website, which opens on the instructions page. Read the instructions carefully and continue to the annotation page. At the top right corner of the webpage, there is a link to the instructions page if you need to reread the instructions at any time.
6. On the annotation page, you will be asked to read a pair of sentences (a factual claim and a tweet). Your task is to decide the truthfulness stance of the tweet toward the factual claim. For that, you should choose one of six options: The tweet author believes the factual claim is false; The tweet author expresses a neutral or no stance toward the factual claim’s truthfulness; The tweet author believes the factual claim is true; The tweet and the claim discuss different topics; The tweet-claim pair is problematic; Skip this pair. The data annotation website will record your choices as well as the timestamp of each interaction with the website.
7. After submitting your choice, a new pair of sentences will appear for annotation. This process will continue until you decide to stop by clicking the “Log Out” button or by closing the web browser.
8. If you want to modify the choice you made on a previous pair, click “Modify My Previous Responses”, where you can see all pairs on which you provided annotations, ordered by timestamp. Modifying a response may change your work quality.
9. You will be asked to complete 40 training annotations. Upon submitting your answer to a training example, the website will display a message indicating whether the answer is correct and provide a justification for the correct answer. After the first 50 annotations, the “Leaderboard” button will be activated. By clicking it, a pop-up window will show your work quality, total compensation, and where you stand among other participants. Only the first three letters of usernames (instead of full names) will be displayed on the leaderboard. The leaderboard calculation is updated every 15-30 pairs, reflecting your new work quality and total compensation.
To gauge the quality of your choices, we use several “gold standard” sentence pairs, for which we have “correct choices” selected by research experts on the subject. Your choices will be compared with the “gold standard” choices to estimate your work quality level. There is no visual distinction between a pair from the “gold standard” set and a pair outside that set.
Below is a list of tips for improving your work quality. You can find a copy of this list under each annotation pair.
(1) Carefully examine each pair of factual claim and tweet.
(2) Contextual information (such as the fact-check summary, claimant information, hyperlink title and content) may help you form answers.
(3) Review the instructions to understand the examples.
(4) Don't guess. Skip the pairs that you are not sure about.
(5) Modify previous responses if necessary.
(6) You may be tempted to pick easy/short claims to work on by clicking “Skip this pair”. Keep in mind that our work quality calculation formula has a component that accounts for the length/complexity of claims as well as how many pairs are skipped. We discourage excessive skipping. Nevertheless, if you are not confident about a question, it is still better to skip, because every single mistake will lower your work quality.
Whenever you make a mistake, our algorithm lowers your work quality, which means you get fewer points for every pair you have annotated. It takes multiple correct answers to make up for every single mistake and bring the work quality back to its previous value. If your current work quality is 0 or very low, it is because our algorithm detected many mistakes in your answers. The best thing to do is to review your answers and modify them if necessary. If your work quality is 0, it might actually be negative internally. If you continue to answer new questions, it will take MANY questions before you can see positive and improving work quality. If you have labeled only 50-150 pairs, the work quality based on this small sample may not reflect your true work quality. It will become more robust once you have labeled more pairs.
We may email you about our optional data annotation training workshops, held in our Lab (ERB - Room 414) or online through MS Teams, which are available to all participants. The purpose of the workshops is to review the information provided on the instructions page, discuss any questions you might have, and annotate some pairs. We expect this activity to be helpful in improving annotation quality. Note that other participants may also be present in the data annotation workshops.
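The relationship between mistakes, “gold standard” pairs, and a displayed work quality of 0 can be illustrated with a minimal sketch. This is not the study's actual formula, which this form does not disclose. The function name, the penalty weight of 3.0, and the clipping at zero are all assumptions made purely for illustration, and the sketch ignores the length/complexity and skipping components mentioned above.

```python
from dataclasses import dataclass

# Hypothetical weight: the form only says it takes "multiple correct answers"
# to offset one mistake. The value 3.0 is an assumption, not the real setting.
MISTAKE_PENALTY = 3.0


@dataclass
class QualityEstimate:
    internal: float   # raw score, may be negative
    displayed: float  # score shown to the participant, never below 0


def estimate_work_quality(answers, gold, penalty=MISTAKE_PENALTY):
    """Compare a participant's answers with expert "gold standard" labels.

    answers: dict mapping pair id -> chosen label (skipped pairs are absent)
    gold:    dict mapping pair id -> expert label
    Only pairs that appear in both dicts are scored.
    """
    scored = [pair_id for pair_id in answers if pair_id in gold]
    if not scored:
        return QualityEstimate(0.0, 0.0)
    correct = sum(1 for pair_id in scored if answers[pair_id] == gold[pair_id])
    wrong = len(scored) - correct
    # Each mistake costs more than a correct answer earns (assumed rule).
    internal = (correct - penalty * wrong) / len(scored)
    return QualityEstimate(internal, max(0.0, internal))


# Illustrative data only: two correct and two incorrect gold-standard answers.
gold = {"p1": "true", "p2": "false", "p3": "neutral", "p4": "true"}
answers = {"p1": "true", "p2": "true", "p3": "false", "p4": "true", "p9": "neutral"}
print(estimate_work_quality(answers, gold))
```

Under these assumed settings, two mistakes out of four scored pairs give a negative internal value while the displayed value stays at 0, which is the situation in which reviewing and correcting earlier answers helps more than answering new pairs.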
POSSIBLE BENEFITS
You will get a taste of how technology might help fact-checkers understand the spread of factual claims. AI technologies will be used for processing the tweet and claim text and extracting information from the processed text. We will also use AI technology to model truthfulness stance detection based on the annotations from subjects. Furthermore, the research outcome might lead to advancement in data-driven fact-checking, which tackles an important scientific challenge.
POSSIBLE RISKS/DISCOMFORTS
If the data is lost or stolen, you may be exposed as a research subject in the study. In addition, there can be a risk of undue influence because a professor figure is recruiting a student. Since there is an authority difference between these two parties, it is possible that you might feel compelled to participate in the research against your best interests. You may worry about your grades, future research opportunities, etc. if you decline.
There may be psychological risks because you may be affected by the content of the factual claims and tweets. A sizable proportion of the factual claims is misinformation. You may potentially misunderstand misinformation and consider it fact. Furthermore, although highly rare given the way we collected tweets for the study, it is possible for a tweet to contain filthy language, profanity, or hate speech. You may feel anger or embarrassment during the annotation.
This research study is not expected to pose any additional risks beyond what you would normally experience in your regular everyday life. However, if you experience discomfort, please inform the research team. You may quit the study at any time without any consequence to you.
To minimize the risk to privacy or confidentiality, we are storing the data on UTA servers and will limit its access to the research team only. To minimize the risk of undue influence, we confirm that your decision to participate or not participate will not influence your grades or future research opportunities in any way. To minimize the risk of psychological influence, we have removed tweets that contain images, GIFs, or videos.
COMPENSATION
You will be compensated for every sentence pair you annotate. The compensation will be in the form of gift cards from Amazon that will be purchased by staff in the CSE department and electronically delivered to the participants. The average pay rate per pair ranges from $0.00 to $0.20. The average pay rate is calculated after your initial 50 responses, based on a formula using multiple factors, and on average, it is updated every 15-30 pairs. These factors include the sentences’ length (the longer, the higher pay), the number of skipped pairs (the more skipped, the less pay), and the quality (i.e., correctness) of your choices. You can use the leaderboard to monitor your work quality score and total points, which define your pay rate and total payment, respectively.
ALTERNATIVE OPTIONS
There are no alternative procedures offered for this study. However, you can elect not to participate in the study or quit at any time at no consequence.
CONFIDENTIALITY
The research team is committed to protecting your rights and privacy as a research participant. All paper and electronic data collected from this study will be stored in a secure location on the UTA campus and/or a secure UTA server for at least three (3) years after the end of this research.
The results of this study may be published and/or presented without naming you as a participant. The data collected about you for this study may be used for future research studies that are not described in this consent form. If that occurs, an IRB would first evaluate the use of any information that is identifiable to you, and confidentiality protection would be maintained.
While absolute confidentiality cannot be guaranteed, the research team will make every effort to protect the confidentiality of your records as described here and to the extent permitted by law. In addition to the research team, the following entities may have access to your records, but only on a need-to-know basis: the U.S. Department of Health and Human Services and the FDA (federal regulating agencies), the reviewing IRB, and sponsors of the study.
CONTACT FOR QUESTIONS
Questions about this research study may be directed to Dr. Chengkai Li at (817) 272-0162 or cli@uta.edu and Zhengyuan Zhu at zhengyuan.zhu@mavs.uta.edu. Any questions you may have about your rights as a research subject or complaints about the research may be directed to the Office of Research Administration; Regulatory Services at 817-272-3723 or regulatoryservices@uta.edu.
CONSENT
By clicking “Accept” below, you confirm that you are 18 years of age or older and have read or had this document read to you. You have been informed about this study’s purpose, procedures, possible benefits and risks, and you have received a copy of this form. You have been given the opportunity to ask questions before you click “Accept”, and you have been told that you can ask other questions at any time.
You voluntarily agree to participate in this study. By clicking “Accept”, you are not waiving any of your legal rights. Refusal to participate will involve no penalty or loss of benefits to which you are otherwise entitled. You may discontinue participation at any time without penalty or loss of benefits to which you are otherwise entitled.
UT Arlington
Informed Consent Document
Chengkai Li
Professor
Department of Computer Science and Engineering
(817) 272-0162
cli@uta.edu
From Answering Questions to Questioning Answers (and Questions)---Perturbation Analysis of Database Queries
INTRODUCTION
You are being asked to participate in a research study about how to detect and counter claims of “fact” which may or may not be true (so-called “lies, d—ed lies, and statistics”) that we often see in news and ads. Your participation is voluntary. Refusal to participate or discontinuing your participation at any time will involve no penalty or loss of benefits to which you are otherwise entitled. Please ask questions if there is anything you do not understand.
PURPOSE
We want to determine whether a sentence contains a factual statement and whether its truthfulness should be checked. You can help us by telling us which factual statements in previous Presidential debates are check-worthy and would benefit the voting public. Your responses will help us collect “training data” for developing automatic fact-checking algorithms.
DURATION
Participants in this study have the full freedom to decide the duration of their participation, from a couple of minutes to as long as they prefer. It is requested that you participate in the study for at least 25 minutes, but you are free to discontinue your participation at any time. We may offer payment for responses for certain time periods. When we do offer payment, we will notify participants. To get paid, you need to provide responses to at least 50 initial sentences after your account registration, which, based on our estimation, takes about 25 minutes on average.
NUMBER OF PARTICIPANTS
The maximum number of anticipated participants in this research study is 1000.
PROCEDURES
The procedures which will involve you as a research participant include:
1. Complete a short registration form with basic information, such as your name and email address. This step should take less than 5 minutes to complete. We will use your name and email only for sharing research results with you and for soliciting participation in future related studies. We will delete your name and email once we have determined that we do not need to follow up with any of our participants.
2. Using the registered account, you will log into our website. On this website, you will be asked to read sentences and determine if each sentence contains a factual statement and if its truthfulness should be checked. The website will record your responses as well as the timestamps of your interactions with the website. The website will not request or access any personal information. Participants in this study have the full freedom to decide the duration of their participation, from a couple of minutes to as long as they prefer. The participants are expected to spend at least 25 minutes.
3. To gauge the quality of your choices, we use several “gold standard” sentences, for which we have “correct choices” selected by research experts on the subject. Your choices will be compared with the “gold standard” choices to estimate your work quality level. There is no visual distinction between a sentence from the “gold standard” set or a sentence outside that set.
At the personal level, a participant may learn to appreciate data and quantitative analysis, and to interpret results critically.
At the societal level, the results of this study will benefit the research and practice of data-driven fact-checking and decision making, which have a wide range of applications that benefit society, such as public policy, journalism, urban planning, business intelligence, and health care.
There are no perceived risks or discomforts for participating in this research study. Should you experience any discomfort, please inform the researcher. You have the right to quit any study procedures at any time at no consequence.
COMPENSATION
We may offer payment for your responses for certain time periods. When we do offer payment, we will notify participants and you will be compensated for every sentence you respond to in the corresponding time periods. The compensation will be in the form of gift cards from Amazon, Visa, or Mastercard that will be purchased by staff in the CSE department and electronically delivered to the participants. The average pay rate per sentence ranges from $0.00 to $0.10. The average pay rate is calculated after your initial 50 responses, based on a formula using multiple factors, and on average, it is updated every 15-30 sentences. These factors include the sentences’ length (the longer, the higher pay), the number of skipped sentences (the more skipped, the less pay), and the quality (i.e., correctness) of your choices. You can use the leaderboard to monitor your total compensation.
ALTERNATIVE PROCEDURES
There are no alternative procedures offered for this study. However, you can elect not to participate in the study or quit at any time at no consequence.
VOLUNTARY PARTICIPATION
Participation in this research study is voluntary. You have the right to decline participation in any or all study procedures or quit at any time at no consequence.
CONFIDENTIALITY
Every attempt will be made to see that your study results are kept confidential. All data collected from this study will be stored in the Department of Computer Science and Engineering at the University of Texas at Arlington for at least three (3) years after the end of this research. The results of this study may be published and/or presented at meetings without naming you as a participant. Additional research studies could evolve from the information you have provided, but your information will not be linked to you in any way; it will be anonymous. Although your rights and privacy will be maintained, the Secretary of the Department of Health and Human Services, the UTA Institutional Review Board (IRB), and personnel particular to this research have access to the study records. Your records will be kept completely confidential according to current legal requirements. They will not be revealed unless required by law, or as noted above. The IRB at UTA has reviewed and approved this study and the information within this consent form. In the unlikely event that it becomes necessary for the Institutional Review Board to review your research records, the University of Texas at Arlington will protect the confidentiality of those records to the extent permitted by law.
CONTACT FOR QUESTIONS
Questions about this research study may be directed to Chengkai Li at (817) 272-0162 or cli@uta.edu. Any questions you may have about your rights as a research participant or a research-related injury may be directed to the Office of Research Administration; Regulatory Services at 817-272-2105 or regulatoryservices@uta.edu.
CONSENT
By clicking “Accept” below, you confirm that you are 18 years of age or older and have read or had this document read to you. You have been informed about this study’s purpose, procedures, possible benefits and risks, and you have received a copy of this form. You have been given the opportunity to ask questions before you click “Accept”, and you have been told that you can ask other questions at any time.
You voluntarily agree to participate in this study. By clicking “Accept”, you are not waiving any of your legal rights. Refusal to participate will involve no penalty or loss of benefits to which you are otherwise entitled. You may discontinue participation at any time without penalty or loss of benefits to which you are otherwise entitled.