I am a computer science Ph.D. student at The University of Texas at Arlington (UTA). I currently work in the IDIR Lab under the supervision of Dr. Chengkai Li. My research interests lie in Big Data and Data Science, including Databases and Data Mining. I have published in prestigious venues such as VLDB, CIKM, ICDE, and IEEE Transactions on Knowledge and Data Engineering (TKDE). My work has won several awards, including the Excellent Demonstration Award (VLDB 2014), the President’s Award (ACES 2016, Graduate Poster), and a Dissertation Fellowship. Before joining the Ph.D. program, I worked as a lecturer at Daffodil International University after completing my B.Sc. at Bangladesh University of Engineering and Technology (BUET).
Automated Fact-checking: Politicians and media figures make claims about “facts” all the time. The new army of fact-checkers can often expose claims that are false, exaggerated, or half-true. However, technology, social media, and new forms of journalism have made it easier than ever to disseminate falsehoods and half-truths faster than fact-checkers can expose them. This “gap” in time and availability limits the effectiveness of fact-checking. The goal of this project is to work toward a fully automated fact-checking platform, investigating the technical challenges and proposing potential solutions [C+J 2015]. We are building ClaimBuster [CIKM 2015], a platform that monitors live streams, websites, and social media to catch factual claims, detects matches with a curated repository of fact-checks, and delivers the matches instantly to viewers. The major components of the platform are text mining, social media analysis, and collaborative fact-checking. This project has received media attention from multiple news outlets, including The Guardian, Austin American-Statesman, Poynter, and New Scientist.
Significant Fact Monitoring: The goal of this project is to help journalists identify data-backed, attention-seizing facts that serve as leads to news stories. Examples of such facts are “This month the Chinese capital has experienced 10 days with a maximum temperature of around 35 degrees Celsius, the most for the month of July in a decade” and “Michael Jordan had 53 points in the Chicago Bulls' win over the Detroit Pistons. No one before had a better or equal performance in the 1995-96 season”. Given an append-only database, the challenge is to design algorithms that, upon the arrival of a new tuple, efficiently search for facts without exhaustively testing all possible ones [ICDE 2014, C+J 2014]. We developed FactWatcher [VLDB 2014], a system that finds story leads from ever-growing data and provides features including fact ranking, fact-to-statement translation, and keyword-based fact search. This system won an Excellent Demonstration Award at VLDB 2014.
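To make the search space concrete, the sketch below shows the naive exhaustive baseline that the project aims to avoid, not the published algorithms. It assumes a fact is a pair (context, measure subspace) in which no earlier tuple matching the context dominates the new tuple on the chosen measures. The function names, the attribute names, and the sample rows are all hypothetical.

```python
from itertools import combinations

def dominates(a, b, measures):
    """a dominates b: at least as large on every measure, strictly larger on one."""
    return (all(a[m] >= b[m] for m in measures)
            and any(a[m] > b[m] for m in measures))

def subsets(items):
    """Yield every subset of items as a tuple, smallest first."""
    for r in range(len(items) + 1):
        yield from combinations(items, r)

def naive_facts(history, new, dims, measures):
    """Exhaustively test every (context, measure subspace) pair for the new tuple."""
    facts = []
    for context in subsets(dims):
        # Earlier tuples that share the new tuple's values on the context attributes.
        peers = [t for t in history if all(t[d] == new[d] for d in context)]
        for subspace in subsets(measures):
            if not subspace:
                continue  # an empty measure set says nothing
            if not any(dominates(t, new, subspace) for t in peers):
                facts.append((context, subspace))
    return facts

# Hypothetical rows, for illustration only.
history = [
    {"team": "X", "season": "S1", "points": 50, "assists": 9},
    {"team": "X", "season": "S2", "points": 40, "assists": 12},
]
new = {"team": "X", "season": "S2", "points": 45, "assists": 10}

facts = naive_facts(history, new, ["team", "season"], ["points", "assists"])
for context, subspace in facts:
    print(context, subspace)
```

The number of pairs tested grows exponentially with the number of dimension and measure attributes, which is why exhaustive testing on every arriving tuple does not scale.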
Skyline Group: Traditional Pareto frontier (skyline) computation is inadequate for queries that need to analyze not only individual points but also groups of points. To address this gap, we proposed a novel concept, “Skyline Group” [TKDE 2014, CIKM 2012], which represents groups that are not dominated by any other group. We demonstrated its applications in question answering, expert team formation, and paper reviewer selection through a web-based system, CrewScout [CIKM 2014]. An attractive characteristic of a skyline team is that no other team of equal size can dominate it. In contrast, given a non-skyline team, there is always a better skyline team. This property distinguishes CrewScout from other team recommendation techniques.
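The difference between a point skyline and a group skyline can be illustrated with a small brute-force sketch. It assumes that larger values are better and that a group is summarized by the per-attribute sum of its members. The choice of aggregate, the function names, and the sample skill scores are assumptions made for illustration, and the enumeration of all groups is not the method from the cited papers.

```python
from itertools import combinations

def dominates(a, b):
    """a dominates b: at least as large everywhere, strictly larger somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def skyline(vectors):
    """Vectors that no other vector dominates."""
    return [v for v in vectors if not any(dominates(u, v) for u in vectors)]

def skyline_groups(points, k):
    """Brute force: size-k groups whose summed vector no other size-k group dominates."""
    dims = range(len(points[0]))
    groups = list(combinations(range(len(points)), k))
    # Assumed aggregate: per-attribute sum over the group's members.
    agg = {g: tuple(sum(points[i][d] for i in g) for d in dims) for g in groups}
    return [g for g in groups
            if not any(dominates(agg[h], agg[g]) for h in groups)]

# Hypothetical experts scored on two skills.
experts = [(5, 1), (1, 5), (3, 3), (1, 1)]

point_skyline = skyline(experts)
team_skyline = skyline_groups(experts, 2)
print(point_skyline)
print(team_skyline)
```

Because all groups compared have the same size k, every group returned satisfies the property stated above: no other team of equal size dominates it.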
Crowdsourcing Pareto-optimal Objects: Finding Pareto-optimal objects through crowdsourcing has applications in public opinion collection, group decision making, and information exploration. Departing from prior studies on crowdsourcing skyline and ranking queries, this work considers the case where objects do not have explicit attributes and preference relations on objects are strict partial orders. The partial orders are derived by aggregating crowdsourcers’ responses to pairwise comparison questions. The goal is to find all Pareto-optimal objects using as few questions as possible [CIKM 2015].
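A minimal sketch of the setting follows, under several assumptions that are not taken from the paper. Answers are aggregated per criterion by a vote-share threshold, the resulting preferences are closed under transitivity, and an object x is taken to dominate y when x is preferred on at least one criterion and y is preferred on none. The sketch does not address the core problem of choosing which questions to ask; the names, the threshold, and the votes are hypothetical.

```python
from collections import Counter

def transitive_closure(pairs):
    """Add (a, d) whenever (a, b) and (b, d) are present; assumes the input is acyclic."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and a != d and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def aggregate(votes, threshold=0.6):
    """votes: list of (winner, loser) answers. Keep a pair if its vote share meets the threshold."""
    wins = Counter(votes)
    prefs = set()
    for (a, b), n in wins.items():
        total = n + wins.get((b, a), 0)
        if n / total >= threshold:
            prefs.add((a, b))
    return transitive_closure(prefs)

def pareto_optimal(objects, prefs_by_criterion):
    """Objects not dominated under the assumed dominance definition."""
    orders = list(prefs_by_criterion.values())
    def dominates(x, y):
        return (any((x, y) in p for p in orders)
                and not any((y, x) in p for p in orders))
    return [y for y in objects if not any(dominates(x, y) for x in objects)]

# Hypothetical crowd answers for three objects and two criteria.
votes = {
    "story": [("A", "B")] * 3 + [("B", "A")] + [("B", "C")] * 2,
    "music": [("A", "B")] * 2 + [("C", "A")] * 2 + [("A", "C")],
}
prefs = {criterion: aggregate(v) for criterion, v in votes.items()}
result = pareto_optimal(["A", "B", "C"], prefs)
print(result)
```

Here A and C each win on one criterion against the other, so neither dominates, while B is dominated by A.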
|University of Texas at Arlington|
|CSE 4334/5334: Data Mining||FALL 2014|
|CSE 6324: Advanced Topics in Software Engineering||SPRING 2014|
|CSE 5311: Design and Analysis of Algorithms||FALL 2013, FALL 2011|
|CSE 3330: Database Systems and File Structures||SPRING 2012, SUMMER 2011, SPRING 2011, FALL 2010|
|CSE 1310: Introduction to Computers and Programming||SPRING 2011|
|Daffodil International University|
|Computer Fundamentals, Numerical Methods, Instrumentation and Control, Electrical Circuit, Compiler, Simulation and Modeling, VLSI.|
|Mentor (University of Texas at Arlington)|
|Current M.S. students||Vikas Sable, Siddhant Gawsane, Pratik Palashikar, Abu Ayub Ansari|
|Current B.S. students||Josue Caraballo|
|Graduated M.S. students||Fatma Dogan (December 2015), Minumol Joseph (December 2015)|
|Graduated B.S. students||Huadong Feng (May 2014)|