
This repository implements a citation prediction model for research articles using deep learning techniques. It uses a dataset of articles published from 2013 to 2021 and indexed in Web of Science. Data preprocessing was done with Python libraries such as NLTK, pandas, and spaCy. The model achieved 88% accuracy, outperforming existing studies.


arslank001/Web-of-Science-Dataset


Predicting Top Cited Research Articles Data

This repository presents a deep learning approach to predicting citations of research articles from the Web of Science dataset. The data spans articles published between 2013 and 2021, including their citation counts and various other features.

Features

  • Data Preprocessing:
    • Data cleaned and preprocessed using Python libraries such as NLTK, pandas, NumPy, Matplotlib, and spaCy.
  • Feature Engineering:
    • Top 10 features selected through three algorithms: Information Gain, Gini Index, and Gain Ratio.
  • Models Used:
    • Machine learning and deep learning models to predict article citations, achieving 88% accuracy.

Feature Selection

Feature selection was done using the following algorithms:

  • Information Gain
  • Gini Index
  • Gain Ratio
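The three criteria can be illustrated with a minimal, standard-library sketch. It is not the repository's code: the feature names (`open_access`, `page_band`), the toy labels, and the helper `top_k` are hypothetical, and the features are assumed to be categorical (numeric columns would need to be binned first). "Gini Index" is read here as the reduction in Gini impurity after splitting on a feature, which is one common interpretation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a sequence of discrete values."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())


def partition(values, labels):
    """Group the labels by the feature value they co-occur with."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    return list(groups.values())


def information_gain(values, labels):
    """Entropy of the labels minus the weighted entropy after the split."""
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in partition(values, labels))
    return entropy(labels) - remainder


def gini_gain(values, labels):
    """Gini impurity of the labels minus the weighted impurity after the split."""
    n = len(labels)
    weighted = sum(len(g) / n * gini(g) for g in partition(values, labels))
    return gini(labels) - weighted


def gain_ratio(values, labels):
    """Information gain normalised by the entropy of the feature itself."""
    split_info = entropy(values)
    return information_gain(values, labels) / split_info if split_info > 0 else 0.0


def top_k(features, labels, score, k=10):
    """Hypothetical helper: names of the k features with the highest score."""
    ranked = sorted(features, key=lambda name: score(features[name], labels), reverse=True)
    return ranked[:k]


# Toy data (hypothetical): 1 = highly cited, 0 = not highly cited.
features = {
    "open_access": ["yes", "yes", "no", "no", "yes", "no"],
    "page_band": ["short", "long", "short", "long", "short", "long"],
}
labels = [1, 1, 0, 0, 1, 0]

for criterion in (information_gain, gini_gain, gain_ratio):
    print(criterion.__name__, top_k(features, labels, criterion, k=1))
```

In this toy data `open_access` separates the two classes perfectly, so all three criteria rank it first. On real data the three rankings can disagree, since gain ratio penalises features with many distinct values.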

Citation Prediction Models

The following models were implemented:

  • Random Forest
  • Gradient Boosting
  • Deep Neural Networks
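As a rough sketch of how these three model families might be compared, the example below uses scikit-learn on synthetic data. Everything in it is an assumption for illustration: the repository does not state which framework, hyperparameters, or train/test split were used, `MLPClassifier` is a small feed-forward network standing in for the deep neural network, and the task is assumed to be binary classification. The scores it prints come from synthetic data and say nothing about the reported results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the 10 selected features (not the Web of Science data).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Hyperparameters are illustrative defaults, not the author's settings.
models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "Deep Neural Network": MLPClassifier(
        hidden_layer_sizes=(64, 32), max_iter=500, random_state=0
    ),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {scores[name]:.3f}")
```

Holding the split and random seed fixed across models keeps the comparison like for like.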

Results

The model achieved an accuracy of 88%, outperforming similar studies in the field.
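For reference, accuracy is the fraction of predictions that match the true labels. The sketch below computes it alongside a majority-class baseline, which is a useful point of comparison whenever the classes are imbalanced. The labels are made up for illustration and the function names are hypothetical; whether the repository's classes are imbalanced is not stated.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline(y_true):
    """Accuracy obtained by always predicting the most frequent class."""
    return Counter(y_true).most_common(1)[0][1] / len(y_true)


# Made-up labels: 1 = highly cited, 0 = not highly cited.
y_true = [1, 0, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0]

print(accuracy(y_true, y_pred))   # 7 of 8 correct
print(majority_baseline(y_true))  # 6 of 8 labels are the majority class
```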

Dataset Availability

The dataset used for this project is available upon request. Please contact the author if you would like access.

Future Work

The full dataset and the title of the associated dissertation will be published here once the related research article has been officially published. Until then, the dataset is available on request, as noted above. Stay tuned for updates!

License

This repository is licensed under the MIT License. See the LICENSE file for details.

Author
