Professor

Personal Information

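  • Business Address: Office, Teaching Building 9, Xipu Campus, Southwest Jiaotong University
  • Alma Mater: Technische Universität Darmstadt, Germany
  • School/Department: School of Computing and Artificial Intelligence
  • Discipline: Software Engineering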
    Computer Application Technology
    Computer Science and Technology



    Research interests

    Web data mining; privacy-preserving data mining; social network analysis; big data management and intelligent analysis (NewSQL, NoSQL)

    Projects


    Project Description
    Web spam detection

    Web spam attempts to influence search engine ranking algorithms in order to boost the rankings of specific Web pages in search results. Some 10%-15% of Web pages have been deliberately contaminated with hidden links and objectionable content. Spamming tricks include injection of advertisements or objectionable content, hidden link attacks, cloaking, and redirection.


    The objectives of Web spamming are to gain benefits or to mount attacks. This project will tackle the challenging problems of Web spam by modeling junk Web pages, extracting spam features, analyzing spammed content and URLs, and designing and improving approaches for detecting malicious Web pages.


    Privacy-preserving data mining

    In 1998, Ann Cavoukian raised a serious concern in her paper "Data mining: Staking a claim on your privacy": data mining may be the biggest challenge that advocates of individual privacy will face in the next decade. At present, research on and applications of big data are in full swing. How to discover the value of big data without disclosing users' private information is a key issue in big data mining research.

    Research topics in this field (PPDM: Privacy-Preserving Data Mining) include:

    • To analyze individual cases of privacy issues involved in data mining from social and legal perspectives;

    • To develop new mining algorithms or improve existing ones by integrating data security strategies (encryption, hiding, etc.) that protect sensitive information as much as possible.


    This research can be conducted by combining distributed technology with data mining, covering aspects such as data publishing, mining algorithms, and the release of mining rules. Most PPDM methods protect data at the cost of reduced information usability and mining accuracy. The purpose of these approaches is to find a trade-off between accuracy and privacy.


    Web Fraud Mining

    As Web information and applications become increasingly rich and widespread, fraud attacks have become rampant. New types of fraud and spam have appeared, such as social networking fraud, multimedia spam, and click fraud/spam, whose main purpose is still to profit illegally from deception.

    The research currently targets several problems: Twitter spammer discovery; multimedia spam detection; comment spam mining; and click fraud detection. Our focus includes investigating spamming tricks and mechanisms, extracting discriminative features, and developing high-performance detection algorithms.


    Web source quality assessment mechanism

    The extremely rich resources of the Web make information acquisition and decision making much easier. However, the quality of Web sources is highly problematic due to the peculiar characteristics of the Web, such as the dynamics and autonomy of Web sources, the enormous amount and variety of Web data, and the diverse quality requirements of Web applications. These result in uneven and uncertain information quality and in poor Web-based planning and strategy making. With the popularization of Wiki sites, Web source quality has become an increasing challenge.

    In this multistage project, we have proposed a Web quality model, WebQM, for capturing Web quality features along three dimensions. The feasibility and effectiveness of WebQM have been verified by SEM with actually observed data. We have developed evaluation approaches under fuzzy environments based on WebQM and implemented a prototype Web quality fuzzy assessment system, on which a sensitivity analysis of the evaluation approach was carried out. Our current work is to model the quality problem of Wiki sources and to adapt WebQM and the evaluation mechanism for assessing the content quality of Wiki sites.


    Digitized product design and manufacturing services based on Internet+

    With the implementation of China's 2025 manufacturing plan, Internet Plus technology has led to the deep integration of industry and information. The conventional patterns and methods of product design and manufacturing will change. It is necessary and important to study and develop new models, tools, and platforms to adapt to this transformation.

    This project was carried out as follows:

    • Based on the mobile Internet, a crowd-creation mode of product design is investigated, and a product crowd-design platform will be built for mobile phones, notebook computers, personal computers, etc.;

    • A crowd-innovation service platform is established by combining virtual design technology and cloud computing;

    • A typical product design and development process will be executed on the platform as a demonstration of the digitized design pattern and service.


    High-speed rail big data management system

    This project is a key part of the digital simulation platform for high-speed rail. Building the platform requires dealing with many problems, including heterogeneous and multi-structured high-speed rail data, a huge amount of data exchange among subsystems, and the efficiency and system independence of data access. This project will develop a data management system to solve these issues and to support branch and coupling simulation as well as multi-dimensional and multi-level visualization. Specific approaches are designed for multi-source data ETL, loosely coupled data management, and multi-branch data access and fusion.