
Gensheng Zhang


Arlington TX, USA

Research Interests

  • Databases
  • Data Mining, Graph Mining
  • Computational Journalism
  • Crowdsourcing

Skills

Python
C++
Java/J2EE
HTML/CSS/JavaScript/PHP
Oracle/MySQL
Linux/Unix, SVN/CVS/Git ...

About me

I am a CSc Ph.D. student at UTA. I started this program in Sept. 2012. I received an M.Sc. degree in CSc from SDSU in 2012. Prior to that, I worked as a software engineer at Revenco for a few years, right after obtaining a B.E. degree in Information Security from Wuhan University in 2006.

Currently, I am working in the Innovative Database and Information Systems Research Lab, under the supervision of Dr. Chengkai Li.

Education

The University of Texas at Arlington

South Dakota State University

2010 - 2012 M.Sc. in Computer Science

Wuhan University

2002 - 2006 B.E. in Information Security

Research experiences

The University of Texas at Arlington

2013 - Present Knowledge Graph Completion

The project strives to complete knowledge graphs by soliciting knowledge from the crowd. Currently we focus on detecting missing knowledge by leveraging both crowd intelligence and artificial intelligence. Crowdsourcing, collaborative filtering, and active learning techniques are applied to achieve this goal.

2012 - 2013 Prominent Streak Discovery

The project studies the problem of discovering long consecutive subsequences that consist of only large (or only small) values in sequence data, e.g., consecutive games of outstanding performance in sports or consecutive hours of heavy network traffic. The outcome of this project provides insightful data patterns for data analysis in many real-world applications and is an enabling technique for computational journalism.
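
The idea of a prominent streak can be illustrated with a small brute-force sketch. It is only an illustration under stated assumptions, not the algorithm from the project: a streak is described here by its length and its minimum value, and a streak is treated as prominent when no other streak is at least as long with an equal or larger minimum (and better in at least one of the two). The function name `prominent_streaks` is hypothetical.

```python
from typing import Dict, List, Tuple


def prominent_streaks(values: List[float]) -> List[Tuple[int, int, float]]:
    """Return (start, end, min_value) for streaks that no other streak dominates.

    Assumption for this sketch: streak B dominates streak A when B is at
    least as long as A and has at least as large a minimum, with one of the
    two strictly better. Brute force, O(n^2) time.
    """
    n = len(values)
    # best[length] = (largest minimum among streaks of that length, start index)
    best: Dict[int, Tuple[float, int]] = {}
    for start in range(n):
        cur_min = values[start]
        for end in range(start, n):
            cur_min = min(cur_min, values[end])
            length = end - start + 1
            if length not in best or cur_min > best[length][0]:
                best[length] = (cur_min, start)

    # Scan from the longest length down; keep a length only if its best
    # minimum strictly exceeds the best minimum of every longer streak.
    result: List[Tuple[int, int, float]] = []
    top = float("-inf")
    for length in range(n, 0, -1):
        cur_min, start = best[length]
        if cur_min > top:
            result.append((start, start + length - 1, cur_min))
            top = cur_min
    return result[::-1]  # shortest streak first
```

When several streaks share the same length and minimum, this sketch reports only the first one. For "small value" streaks, the same code can be applied to the negated sequence.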

South Dakota State University

2010 - 2012 Medical Image Processing

The project helps detect breast cancer at an early stage, when detection can reduce mortality most significantly. We classify breast masses detected in mammograms to determine whether they are malignant. Various image processing techniques are applied, for example segmentation, smoothing, enhancement, and contour analysis.

Work experiences

Google Inc.

Jun. 2015 - Aug. 2015 Software Engineer (Intern)

  1. Worked with the Spandex team - SQL support for Spanner
  2. Developed "Query Reducer": a tool that reduces a complex and lengthy query to its minimal form while retaining the issue exhibited by the original query, e.g., it still reproduces a bug exposed by the query.
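
The general idea behind reducing an input while preserving an issue can be sketched as follows. This is a generic greedy reduction in the spirit of delta debugging, not the actual Query Reducer implementation; the function name `reduce_query`, the representation of a query as a list of parts, and the `still_fails` callback are all assumptions made for illustration.

```python
from typing import Callable, List


def reduce_query(parts: List[str],
                 still_fails: Callable[[List[str]], bool]) -> List[str]:
    """Greedily drop parts of a query while the issue still reproduces.

    `parts` is a hypothetical decomposition of the query (e.g. predicates),
    and `still_fails` is a caller-supplied check that returns True when the
    candidate still exhibits the original issue.
    """
    if not still_fails(parts):
        raise ValueError("the original input does not exhibit the issue")

    current = list(parts)
    changed = True
    while changed:  # repeat until a full pass removes nothing
        changed = False
        i = 0
        while i < len(current):
            candidate = current[:i] + current[i + 1:]
            if still_fails(candidate):
                current = candidate  # part i was not needed; retry same index
                changed = True
            else:
                i += 1  # part i is needed; move on
    return current
```

The result is minimal only in the sense that removing any single remaining part makes the issue disappear. For example, with parts `["a = 1", "b = 2", "c / 0 = 1", "d = 4"]` and a check that looks for a division by zero, only `"c / 0 = 1"` remains.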

NEC Laboratories America, Inc.

Dec. 2014 - Present Research Assistant (Intern)

  1. Big data storage improvement - investigating how to store and retrieve enterprise security information efficiently to enable various security applications
  2. Insider intrusion detection and defense - defending the enterprise system in depth
  3. Incident diagnosis and recovery - providing root cause analysis of security and performance incidents

The University of Texas at Arlington

2012 - Present GTA

Teaching Assistant for Data Mining, Databases, Intermediate Programming, and other classes.

South Dakota State University

2010 - 2012 GTA, GRA

Teaching Assistant for Software Engineering, Algorithms, and other classes.

Revenco Group

2006 - 2010 Software Engineer
Software Engineer Maywide Tech., Guangzhou China 2008 - 2010
  • Team Lead of 6 Members
  • Elicited and analyzed customer requirements
  • Designed conceptual and physical database models
  • Used C++/Python to implement service billing functionality
Software Engineer Sunrise Corp., Guangzhou China 2006 - 2008
  • Employee of the year (2007, 2008)
  • Used C++/Java to implement functionality for 3 business support systems serving more than 20 million users.

Publications

Data In, Fact Out: Automated Monitoring of Facts by Factwatcher

N. Hassan, A. Sultana, Y. Wu, G. Zhang#, C. Li, J. Yang, C. Yu.
VLDB'14, Excellent Demonstration Award

#Contribution: Demonstrates one of the three fact types: Prominent Streak Facts.

Finding, Monitoring, and Checking Claims Computationally Based on Structured Data

The iCheck/uClaim Team (Duke University, University of Texas at Arlington, Google Research)
Computation+Journalism Symposium 2014

Crowdsourcing Pareto-Optimal Object Finding by Pairwise Comparisons

A. Asudeh, G. Zhang, N. Hassan, C. Li, G. Zaruba
arXiv'14 Technical Report

Discovering General Prominent Streaks in Sequence Data

G. Zhang, X. Jiang, P. Luo, M. Wang, C. Li.
TKDD'14

A Review of Breast Tissue Classification in Mammograms

G. Zhang, W. Wang, J. Moon, J. Pack, S. Jeon.
RACS'11