A deep learning framework for unsupervised anomaly detection in time series data using autoencoder architectures.
This repository implements the unsupervised anomaly detection framework presented in:
"Unsupervised anomaly detection of permanent-magnet offshore wind generators through electrical and electromagnetic measurements" by Ali Dibaj, Mostafa Valavi, and Amir R. Nejad
Wind Energy Science, 2024
DOI: https://doi.org/10.5194/wes-9-2063-2024
The initial implementation uses a convolutional autoencoder (CAE) model, as shown in Fig. 1, trained on electrical and electromagnetic time series data for anomaly detection in wind turbine permanent-magnet generators. The model is trained on normal operational data and evaluated on various fault conditions. For detailed methodology and results, please refer to the paper (we appreciate citations if you find this work useful for your research).
Fig. 1: Overview of data processing and CAE model architecture implemented in the paper
The dataset used in the original paper was generated using proprietary simulation software for wind turbine permanent-magnet generators and is confidential. It contains electrical and electromagnetic signals under both normal and various fault conditions.
For demonstration and validation purposes, this repository uses the Case Western Reserve University (CWRU) Bearing Vibration Dataset, which is publicly available. The CWRU dataset contains vibration measurements from normal and faulty bearings with different fault types and severities.
Dataset Source: CWRU Bearing Data Center
Hugging Face Version: A processed version is available via Hugging Face Datasets as alidi/cwru-dataset
Note: To access the dataset, you'll need a Hugging Face account and access token:
from huggingface_hub import login
login(token="your_huggingface_token")

To get your token: sign up at huggingface.co → Settings → Access Tokens → Create new token.
The data pipeline is designed to be modular, allowing users to easily adapt the framework to work with their own time series datasets.
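As a rough illustration of the kind of preprocessing such a pipeline performs (the function name and parameters below are illustrative, not the repository's actual API), a raw 1-D signal is typically segmented into fixed-length windows before being fed to the autoencoder:

```python
import numpy as np

def segment_signal(signal: np.ndarray, window: int, stride: int) -> np.ndarray:
    """Slice a 1-D time series into overlapping fixed-length windows.

    Illustrative helper only; window/stride values are assumptions.
    """
    n = (len(signal) - window) // stride + 1
    return np.stack([signal[i * stride : i * stride + window] for i in range(n)])

# Example: 10 s of a 12 kHz vibration signal, 2048-point windows, 50% overlap
x = np.random.randn(120_000)
windows = segment_signal(x, window=2048, stride=1024)
print(windows.shape)  # (116, 2048)
```

Adapting the framework to a new dataset then mostly amounts to swapping in a loader that produces such windowed arrays.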
This repository extends the original work by implementing and evaluating additional autoencoder architectures and loss functions for time series anomaly detection.
- Convolutional Autoencoder (Original paper)
- Wavenet-based Autoencoder
- Attention-based Autoencoder
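To make the general shape of these models concrete, here is a minimal 1-D convolutional autoencoder sketch in PyTorch. It is not the paper's exact layer configuration, just the standard encode-compress-decode pattern all three variants share:

```python
import torch
import torch.nn as nn

class TinyCAE(nn.Module):
    """Minimal 1-D convolutional autoencoder (illustrative; the repository's
    models have their own layer counts and hyperparameters)."""
    def __init__(self, channels: int = 1):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(channels, 16, kernel_size=8, stride=2, padding=3), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=8, stride=2, padding=3), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(32, 16, kernel_size=8, stride=2, padding=3), nn.ReLU(),
            nn.ConvTranspose1d(16, channels, kernel_size=8, stride=2, padding=3),
        )

    def forward(self, x):
        # Reconstruct the input from its compressed latent representation
        return self.decoder(self.encoder(x))

x = torch.randn(4, 1, 2048)   # (batch, channels, window length)
recon = TinyCAE()(x)
print(recon.shape)  # torch.Size([4, 1, 2048])
```

The WaveNet and attention variants replace the plain convolutions with dilated causal convolutions and self-attention blocks, respectively, while keeping the same reconstruction objective.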
The framework supports multiple loss functions that can be applied in both time and frequency domains:
- MSE, MAE, Huber, Cosine Similarity, KL Divergence
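One plausible reading of a frequency-domain loss (an assumption on my part, not necessarily the repository's exact formulation) is to compare magnitude spectra of the input and its reconstruction:

```python
import torch

def frequency_mse(x: torch.Tensor, x_hat: torch.Tensor) -> torch.Tensor:
    """MSE between magnitude spectra; one way to apply a loss in the
    frequency domain (illustrative, not the repository's exact code)."""
    X = torch.fft.rfft(x, dim=-1).abs()
    X_hat = torch.fft.rfft(x_hat, dim=-1).abs()
    return torch.mean((X - X_hat) ** 2)

x = torch.randn(4, 2048)
loss = frequency_mse(x, x + 0.01 * torch.randn_like(x))
```

Frequency-domain losses can be attractive for rotating machinery, where faults often appear as changes at characteristic frequencies rather than in the raw waveform.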
Below are the anomaly detection results for the three implemented models when using time-domain MSE as the loss function. Each plot shows the reconstruction error (anomaly score) for normal data (samples 0-1600) and faulty data (samples 1600-5200) from the CWRU bearing dataset, covering both drive-end and fan-end measurements under different load conditions.
Fig. 2: Convolutional Autoencoder (CAE) anomaly scores with time-domain MSE loss
Fig. 3: WaveNet Autoencoder anomaly scores with time-domain MSE loss
Fig. 4: Attention Autoencoder anomaly scores with time-domain MSE loss
The plots show how effectively each model distinguishes between normal and faulty bearing conditions. The Attention Autoencoder appears to be the most sensitive to anomalies, likely due to its ability to capture long-range dependencies in the signal.
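A common way to turn such scores into detections (sketched here with synthetic numbers, not the repository's actual results) is to threshold at a high percentile of the scores seen on normal training data:

```python
import numpy as np

# Synthetic reconstruction errors for illustration only; the plots above are
# built from real model outputs, not these values.
rng = np.random.default_rng(0)
normal_scores = rng.normal(loc=0.02, scale=0.005, size=1600)   # samples 0-1600
faulty_scores = rng.normal(loc=0.10, scale=0.02, size=3600)    # samples 1600-5200

# Unsupervised rule: threshold at a high percentile of the normal scores,
# then flag anything above it as anomalous.
threshold = np.percentile(normal_scores, 99)
flagged = np.concatenate([normal_scores, faulty_scores]) > threshold
print(f"threshold={threshold:.4f}, flagged {flagged.sum()} of {flagged.size} samples")
```

Because only normal data is used to set the threshold, no fault labels are needed at training time, which is the core appeal of the unsupervised setup.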
This project uses Poetry for dependency management. The main dependencies are:
- Python >= 3.10
- PyTorch 2.2+
- Lightning 2.1+
- NumPy
- Librosa (for signal processing)
- Matplotlib (for visualization)
- Hugging Face Datasets
# Install Poetry if you haven't already
curl -sSL https://install.python-poetry.org | python3 -
# Clone the repository
git clone https://github.com/alidibaj/autoencoder-based-anomaly-detection
cd autoencoder-based-anomaly-detection
# Install dependencies using Poetry
poetry install
# Activate the virtual environment
eval $(poetry env activate)  # Poetry 2.x; use `poetry shell` with Poetry 1.x
python train.py
You can customize the model architecture, loss function, training parameters, and more by modifying the configuration file config.py:
config["which_model"] = "CAE" # Options: CAE, WavenetAE, AttentionAE
config["loss_fn"] = "mse" # Options: mse, mae, huber, cosine, kl_divergence, shape_factor, combined
config["loss_domain"] = "frequency" # Options: time, frequency
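Internally, string options like these are typically resolved through a simple lookup. A hedged sketch for the loss option (the actual wiring in train.py may differ):

```python
import torch
import torch.nn as nn

# Illustrative dispatch only; the repository's train.py may resolve
# config["loss_fn"] differently.
LOSSES = {
    "mse": nn.MSELoss(),
    "mae": nn.L1Loss(),
    "huber": nn.HuberLoss(),
}

config = {"loss_fn": "mse"}
loss_fn = LOSSES[config["loss_fn"]]

x, x_hat = torch.zeros(8, 2048), torch.ones(8, 2048)
print(loss_fn(x_hat, x).item())  # 1.0: MSE of a constant error of 1
```

The same pattern extends naturally to selecting the model class from config["which_model"].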
- models/ - Autoencoder model definitions
- src/ - Core functionality including data processing and training
- train.py - Main training script
- config.py - Configuration parameters
This project is licensed under the MIT License - see the LICENSE file for details.