A comprehensive C++ library for implementing and learning both deep learning and traditional machine learning algorithms from scratch, featuring modern C++ design patterns, extensive documentation, and automated CI/CD.
Project Goals
This project provides a structured framework for implementing fundamental deep learning and traditional machine learning algorithms in C++. It's designed for educational purposes and hands-on learning of:
Deep Learning Algorithms:
- Neural network architectures (Feedforward, CNN, RNN, LSTM, GRU)
- Optimization algorithms (SGD, Adam, RMSprop)
- Activation functions (ReLU, Sigmoid, Tanh, Softmax, LeakyReLU)
- Loss functions (MSE, Cross-entropy, Hinge loss)
Traditional Machine Learning Algorithms:
- Dimensionality reduction (Principal Component Analysis)
- Clustering algorithms (K-Means)
- Classification algorithms (Support Vector Machine)
Utilities:
- Mathematical utilities and high-performance matrix operations
- Data processing with comprehensive loading and preprocessing utilities
Key Features
- Comprehensive Documentation: Full Doxygen documentation with examples and mathematical descriptions
- Modern C++: Uses C++23 features and best practices
- Tested: Comprehensive test suite with Google Test
- CI/CD: Automated testing, static analysis, and documentation deployment
- Performance: Optimized matrix operations and memory management
- Educational: Detailed comments and learning-focused design
Project Structure
deep-learning-algo-impls/
├── include/                      # Header files
│   ├── neural_networks/          # Deep learning architectures
│   │   ├── feedforward.hpp       # Feedforward neural networks
│   │   ├── cnn.hpp               # Convolutional neural networks
│   │   └── rnn.hpp               # Recurrent neural networks (RNN/LSTM/GRU)
│   ├── optimization/             # Optimization algorithms
│   │   └── optimizers.hpp        # SGD, Adam, RMSprop optimizers
│   ├── activation/               # Activation functions
│   │   └── functions.hpp         # ReLU, Sigmoid, Tanh, Softmax
│   ├── loss/                     # Loss functions
│   │   └── functions.hpp         # MSE, Cross-entropy, Hinge loss
│   ├── ml/                       # Traditional ML algorithms
│   │   ├── ml.hpp                # Main ML header (includes all algorithms)
│   │   ├── pca.hpp               # Principal Component Analysis
│   │   ├── kmeans.hpp            # K-Means clustering
│   │   └── svm.hpp               # Support Vector Machine
│   └── utils/                    # Utility classes
│       ├── matrix.hpp            # Matrix operations
│       └── data_loader.hpp       # Data loading and preprocessing
├── src/                          # Implementation files
│   ├── neural_networks/          # Deep learning implementations
│   ├── optimization/             # Optimizer implementations
│   ├── activation/               # Activation function implementations
│   ├── loss/                     # Loss function implementations
│   ├── ml/                       # Traditional ML implementations
│   │   ├── pca.cpp               # PCA implementation
│   │   ├── kmeans.cpp            # K-Means implementation
│   │   └── svm.cpp               # SVM implementation
│   └── utils/                    # Utility implementations
├── tests/                        # Unit tests
│   ├── test_feedforward.cpp      # Neural network tests
│   ├── test_matrix.cpp           # Matrix operation tests
│   └── test_optimizers.cpp       # Optimizer tests
├── .github/workflows/            # CI/CD pipelines
│   └── ci.yml                    # Automated testing workflow
├── CMakeLists.txt                # Build configuration
├── Doxyfile                      # Documentation configuration
└── main.cpp                      # Example usage
Documentation
Full API documentation is automatically generated using Doxygen with the Doxygen Awesome theme (https://jothepro.github.io/doxygen-awesome-css/) and deployed to GitHub Pages:
View Documentation
The documentation features:
- Modern, clean design with improved readability
- Mobile-responsive interface for documentation on any device
- Dark mode support for comfortable viewing
- Enhanced navigation with sidebar treeview
- Complete API reference with examples
- Mathematical descriptions of algorithms
- Usage patterns and best practices
- Implementation guides and tutorials
Quick Start
Matrix Operations
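#include "utils/matrix.hpp"
// Assumes Matrix is in scope, e.g. through a using-declaration
Matrix<double> a(3, 3, 1.0);                      // 3x3 matrix with every element set to 1.0
Matrix<double> b = Matrix<double>::random(3, 3);  // 3x3 matrix of random values
auto c = a * b;          // multiplication
auto d = a + b;          // addition
auto e = a.transpose();  // transpose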
Principal Component Analysis
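#include <iostream>
#include "ml/pca.hpp"
// Sample 2-D data set. The declaration line is reconstructed from context;
// MatrixD is the matrix type used for the result below.
MatrixD data({
    {2.5, 2.4},
    {0.5, 0.7},
    {2.2, 2.9},
    {1.9, 2.2},
    {3.1, 3.0}
});
// Assumption: a PCA object named `pca` has already been constructed and fitted
// to `data`; those lines are not shown here.
// Fraction of the total variance captured by each principal component
auto variance_ratio = pca.explained_variance_ratio();
for (size_t i = 0; i < variance_ratio.size(); ++i) {
    std::cout << "Component " << i << ": " << variance_ratio[i] << '\n';
}
// Reduce the data to a single component
MatrixD reduced_data = pca.transform(data, 1);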
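To make the explained variance ratio above less opaque, here is a minimal standalone sketch for the 2-D case. It does not use the project's PCA class: `explained_variance_ratio_2d` is a hypothetical helper, and it relies on the closed-form eigenvalues of a symmetric 2x2 covariance matrix rather than a general eigen-solver.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper, not part of the project's API: returns the fraction of
// total variance explained by each of the two principal components of 2-D data.
std::array<double, 2> explained_variance_ratio_2d(
    const std::vector<std::array<double, 2>>& points) {
    assert(points.size() >= 2);
    const double n = static_cast<double>(points.size());

    // Column means
    double mean_x = 0.0, mean_y = 0.0;
    for (const auto& p : points) {
        mean_x += p[0];
        mean_y += p[1];
    }
    mean_x /= n;
    mean_y /= n;

    // Entries of the 2x2 sample covariance matrix [[sxx, sxy], [sxy, syy]]
    double sxx = 0.0, syy = 0.0, sxy = 0.0;
    for (const auto& p : points) {
        const double dx = p[0] - mean_x;
        const double dy = p[1] - mean_y;
        sxx += dx * dx;
        syy += dy * dy;
        sxy += dx * dy;
    }
    sxx /= n - 1.0;
    syy /= n - 1.0;
    sxy /= n - 1.0;

    // Closed-form eigenvalues of a symmetric 2x2 matrix
    const double half_trace = (sxx + syy) / 2.0;
    const double half_diff = (sxx - syy) / 2.0;
    const double radius = std::sqrt(half_diff * half_diff + sxy * sxy);
    const double lambda1 = half_trace + radius;  // larger eigenvalue
    const double lambda2 = half_trace - radius;  // smaller eigenvalue

    const double total = lambda1 + lambda2;  // equals the total variance
    assert(total > 0.0);                     // data must not be a single repeated point
    return {lambda1 / total, lambda2 / total};
}
```

For points that lie exactly on a line, the first ratio is 1 and the second is 0, which is why a single component is enough to represent such data.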
Neural Network Training
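#include "neural_networks/feedforward.hpp"
#include "utils/data_loader.hpp"
// Layer sizes: input, two hidden layers, output
std::vector<size_t> layers = {784, 128, 64, 10};
neural_networks::FeedforwardNetwork network(layers);
// Load features and labels from a CSV file (column indices 0-3 and 4)
auto [features, labels] = utils::CSVLoader::load_features_labels(
    "data.csv", {0, 1, 2, 3}, {4});
// `dataset` and `test_features` are assumed to be defined elsewhere;
// their construction is not shown here.
network.train(dataset, /*epochs=*/100, /*learning_rate=*/0.01);
auto predictions = network.predict(test_features);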
Data Loading and Preprocessing
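#include "utils/data_loader.hpp"
// Read a CSV file into memory
auto data = CSVLoader::load_csv("dataset.csv");
// Rescale values to the range [0.0, 1.0]
auto normalized = Preprocessor::normalize(data, 0.0, 1.0);
// Rescale to zero mean and unit variance
auto standardized = Preprocessor::standardize(data);
// 70% training, 15% validation, remainder for testing
// (`dataset` is assumed to be defined elsewhere)
auto [train, val, test] = Preprocessor::train_val_test_split(
    dataset, 0.7, 0.15);
// Iterate over mini-batches (`loader` is a batch loader whose construction is not shown)
while (loader.has_next()) {
    auto [batch_features, batch_labels] = loader.next_batch();
}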
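The `has_next()` / `next_batch()` loop above follows a common mini-batch pattern. The sketch below shows one way such a loader could work internally. `BatchIterator` is a hypothetical class written for illustration, not the project's loader, and it stores a single scalar feature per sample to keep the example short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical mini-batch iterator (one scalar feature per sample for brevity).
class BatchIterator {
public:
    BatchIterator(std::vector<double> features, std::vector<int> labels,
                  std::size_t batch_size)
        : features_(std::move(features)),
          labels_(std::move(labels)),
          batch_size_(batch_size) {
        assert(features_.size() == labels_.size());
        assert(batch_size_ > 0);
    }

    // True while at least one sample has not been returned yet.
    bool has_next() const { return position_ < features_.size(); }

    // Returns the next batch; the final batch may be smaller than batch_size.
    std::pair<std::vector<double>, std::vector<int>> next_batch() {
        const std::size_t end = std::min(position_ + batch_size_, features_.size());
        const auto first = static_cast<std::ptrdiff_t>(position_);
        const auto last = static_cast<std::ptrdiff_t>(end);
        std::vector<double> batch_features(features_.begin() + first,
                                           features_.begin() + last);
        std::vector<int> batch_labels(labels_.begin() + first,
                                      labels_.begin() + last);
        position_ = end;
        return {std::move(batch_features), std::move(batch_labels)};
    }

private:
    std::vector<double> features_;
    std::vector<int> labels_;
    std::size_t batch_size_;
    std::size_t position_ = 0;
};
```

With five samples and a batch size of two, this yields batches of sizes 2, 2, and 1. A real loader would typically also shuffle the sample order at the start of each epoch.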
Prerequisites
- C++23 compatible compiler (GCC 11+, Clang 14+, or MSVC 2022+)
- CMake 3.31 or higher
- Google Test for unit testing
- Git for version control
Installing Dependencies
Ubuntu/Debian
sudo apt-get update
sudo apt-get install -y cmake ninja-build libgtest-dev
sudo apt-get install -y gcc-11 g++-11 # or clang-14
macOS
brew install cmake ninja googletest
Windows (vcpkg)
Building the Project
- Clone the repository
git clone <repository-url>
cd deep-learning-algo-impls
- Configure and build
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build
- Run tests
ctest --test-dir build --output-on-failure
- Run the main executable
./build/deep_learning_algo_impls
Implementation Guide
This project provides header files with comprehensive TODO comments and example structures. Each algorithm should be implemented following these guidelines:
1. Neural Networks
- Feedforward Networks: Implement basic multilayer perceptrons with configurable architectures
- CNNs: Add convolution, pooling, and feature extraction layers
- RNNs: Implement sequence processing with LSTM and GRU variants
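As a starting point for the feedforward item above, the following sketch shows a forward pass through fully connected layers. It is independent of the project's headers: `DenseLayer`, `forward_layer`, and `forward` are hypothetical names, and ReLU is applied after every layer purely for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical dense layer: weights are stored row-major, one row per output unit.
struct DenseLayer {
    std::size_t inputs;
    std::size_t outputs;
    std::vector<double> weights;  // outputs * inputs values
    std::vector<double> biases;   // outputs values
};

inline double relu(double x) { return x > 0.0 ? x : 0.0; }

// Computes relu(W * x + b) for a single layer.
std::vector<double> forward_layer(const DenseLayer& layer,
                                  const std::vector<double>& x) {
    assert(x.size() == layer.inputs);
    assert(layer.weights.size() == layer.inputs * layer.outputs);
    assert(layer.biases.size() == layer.outputs);
    std::vector<double> out(layer.outputs);
    for (std::size_t o = 0; o < layer.outputs; ++o) {
        double sum = layer.biases[o];
        for (std::size_t i = 0; i < layer.inputs; ++i) {
            sum += layer.weights[o * layer.inputs + i] * x[i];
        }
        out[o] = relu(sum);
    }
    return out;
}

// Feeds the input through every layer in order.
std::vector<double> forward(const std::vector<DenseLayer>& layers,
                            std::vector<double> x) {
    for (const auto& layer : layers) {
        x = forward_layer(layer, x);
    }
    return x;
}
```

A network for classification would normally end with a different activation (for example softmax) on the last layer; that choice is left out here to keep the sketch short.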
2. Optimization
- SGD: Basic gradient descent with momentum support
- Adam: Adaptive learning rates with bias correction
- RMSprop: Root mean square propagation
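The update rules named above can be sketched on plain parameter vectors. The structs below are illustrative only: the names `SgdMomentum` and `Adam`, their fields, and the `step` signature are assumptions, not the interface declared in `optimizers.hpp`. The Adam defaults are the commonly used values.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// SGD with classical momentum: v = mu * v - lr * g, then w += v.
struct SgdMomentum {
    double learning_rate;
    double momentum;
    std::vector<double> velocity;

    void step(std::vector<double>& params, const std::vector<double>& grads) {
        assert(params.size() == grads.size());
        if (velocity.size() != params.size()) velocity.assign(params.size(), 0.0);
        for (std::size_t i = 0; i < params.size(); ++i) {
            velocity[i] = momentum * velocity[i] - learning_rate * grads[i];
            params[i] += velocity[i];
        }
    }
};

// Adam with bias-corrected first and second moment estimates.
struct Adam {
    double learning_rate = 0.001;
    double beta1 = 0.9;
    double beta2 = 0.999;
    double epsilon = 1e-8;
    std::size_t t = 0;         // number of steps taken so far
    std::vector<double> m, v;  // running first and second moments

    void step(std::vector<double>& params, const std::vector<double>& grads) {
        assert(params.size() == grads.size());
        if (m.size() != params.size()) {
            m.assign(params.size(), 0.0);
            v.assign(params.size(), 0.0);
            t = 0;
        }
        ++t;
        // Bias-correction factors compensate for the zero-initialized moments.
        const double c1 = 1.0 - std::pow(beta1, static_cast<double>(t));
        const double c2 = 1.0 - std::pow(beta2, static_cast<double>(t));
        for (std::size_t i = 0; i < params.size(); ++i) {
            m[i] = beta1 * m[i] + (1.0 - beta1) * grads[i];
            v[i] = beta2 * v[i] + (1.0 - beta2) * grads[i] * grads[i];
            const double m_hat = m[i] / c1;
            const double v_hat = v[i] / c2;
            params[i] -= learning_rate * m_hat / (std::sqrt(v_hat) + epsilon);
        }
    }
};
```

On the very first Adam step the bias correction makes the update magnitude approximately equal to the learning rate, regardless of the gradient's scale. RMSprop follows the same shape as Adam without the first-moment term and bias correction.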
3. Mathematical Utilities
- Matrix Class: Efficient matrix operations for linear algebra
- Activation Functions: Differentiable activation functions
- Loss Functions: Various loss functions for different tasks
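For the activation and loss items above, a minimal sketch of a few standard definitions may help. The free functions below are illustrative and use `std::vector<double>` rather than the project's Matrix class; the function names and the clamping constant in `cross_entropy` are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Logistic sigmoid and its derivative, sigma(x) * (1 - sigma(x)).
double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

double sigmoid_derivative(double x) {
    const double s = sigmoid(x);
    return s * (1.0 - s);
}

// Softmax with the maximum logit subtracted first for numerical stability.
std::vector<double> softmax(const std::vector<double>& logits) {
    assert(!logits.empty());
    const double max_logit = *std::max_element(logits.begin(), logits.end());
    std::vector<double> out(logits.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        out[i] = std::exp(logits[i] - max_logit);
        sum += out[i];
    }
    for (double& p : out) p /= sum;
    return out;
}

// Mean squared error between predictions and targets.
double mean_squared_error(const std::vector<double>& predicted,
                          const std::vector<double>& target) {
    assert(!predicted.empty() && predicted.size() == target.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < predicted.size(); ++i) {
        const double diff = predicted[i] - target[i];
        sum += diff * diff;
    }
    return sum / static_cast<double>(predicted.size());
}

// Cross-entropy between predicted probabilities and a target distribution.
// Probabilities are clamped away from zero so that log() stays finite.
double cross_entropy(const std::vector<double>& probabilities,
                     const std::vector<double>& target) {
    assert(probabilities.size() == target.size());
    const double min_probability = 1e-12;
    double loss = 0.0;
    for (std::size_t i = 0; i < probabilities.size(); ++i) {
        loss -= target[i] * std::log(std::max(probabilities[i], min_probability));
    }
    return loss;
}
```

Subtracting the maximum logit does not change the softmax result, but it keeps `std::exp` from overflowing for large inputs.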
4. Data Processing
- Data Loaders: CSV and image data loading utilities
- Preprocessing: Normalization, standardization, and augmentation
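The two preprocessing steps named above differ in what they fix: normalization maps values into a chosen range, while standardization rescales them to zero mean and unit variance. The sketch below applies both to a single feature column. The function names are hypothetical, and the use of the population standard deviation is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Min-max normalization: maps values linearly into [lo, hi].
std::vector<double> min_max_normalize(const std::vector<double>& values,
                                      double lo = 0.0, double hi = 1.0) {
    assert(!values.empty());
    const auto bounds = std::minmax_element(values.begin(), values.end());
    const double min_value = *bounds.first;
    const double range = *bounds.second - min_value;
    std::vector<double> out(values.size(), lo);  // constant input maps to lo
    if (range == 0.0) return out;
    for (std::size_t i = 0; i < values.size(); ++i) {
        out[i] = lo + (values[i] - min_value) / range * (hi - lo);
    }
    return out;
}

// Standardization (z-score): subtract the mean, divide by the standard deviation.
std::vector<double> standardize(const std::vector<double>& values) {
    assert(!values.empty());
    const double n = static_cast<double>(values.size());
    double mean = 0.0;
    for (double v : values) mean += v;
    mean /= n;
    double variance = 0.0;
    for (double v : values) variance += (v - mean) * (v - mean);
    variance /= n;  // population variance
    const double std_dev = std::sqrt(variance);
    std::vector<double> out(values.size(), 0.0);  // constant input maps to 0
    if (std_dev == 0.0) return out;
    for (std::size_t i = 0; i < values.size(); ++i) {
        out[i] = (values[i] - mean) / std_dev;
    }
    return out;
}
```

In practice the minimum, range, mean, and standard deviation should be computed on the training split only and then reused for validation and test data.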
Testing Strategy
The project includes comprehensive unit tests for:
- Matrix operations and mathematical correctness
- Neural network forward/backward propagation
- Optimizer convergence and update rules
- Activation and loss function derivatives
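A common way to test the derivatives listed above is to compare each analytic derivative against a central finite difference. The helper below is a standalone sketch with hypothetical names. In the project's test suite, the comparison could instead be expressed with Google Test's `EXPECT_NEAR`.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Central finite-difference approximation of f'(x).
double numerical_derivative(const std::function<double(double)>& f,
                            double x, double h = 1e-5) {
    assert(h > 0.0);
    return (f(x + h) - f(x - h)) / (2.0 * h);
}

// Returns true when the analytic derivative df agrees with the numerical
// estimate at x to within the given absolute tolerance.
bool derivative_matches(const std::function<double(double)>& f,
                        const std::function<double(double)>& df,
                        double x, double tolerance = 1e-6) {
    return std::fabs(numerical_derivative(f, x) - df(x)) <= tolerance;
}
```

The step size and tolerance are illustrative defaults. Functions with kinks, such as ReLU at zero, should be checked away from the non-differentiable point.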
Running Specific Tests
# Run all tests (from the project root)
ctest --test-dir build
# Run a specific test suite
./build/run_tests --gtest_filter="MatrixTest.*"
# Run with verbose output
ctest --test-dir build --verbose
Continuous Integration
The project includes GitHub Actions workflows that automatically:
- Build and test on multiple platforms (Ubuntu, macOS)
- Test with different compilers (GCC, Clang)
- Run static analysis and code formatting checks
- Generate documentation (when implemented)
- Perform memory leak detection
Learning Path
Recommended implementation order for learning:
- Start with Matrix utilities - Foundation for all operations
- Implement activation functions - Simple mathematical functions
- Build feedforward networks - Core neural network concepts
- Add optimization algorithms - Learning and convergence
- Implement loss functions - Training objectives
- Extend to CNNs - Spatial data processing
- Add RNNs/LSTMs - Sequential data processing
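For the first step in the order above, a matrix utility can start very small. The class below is a minimal sketch, not the project's `Matrix` class: `SimpleMatrix` is a hypothetical name, storage is a flat row-major vector, and shape errors are caught with `assert` rather than exceptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal matrix with row-major storage.
template <typename T>
class SimpleMatrix {
public:
    SimpleMatrix(std::size_t rows, std::size_t cols, T value = T{})
        : rows_(rows), cols_(cols), data_(rows * cols, value) {}

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

    // Element access by (row, column)
    T& operator()(std::size_t r, std::size_t c) { return data_[r * cols_ + c]; }
    const T& operator()(std::size_t r, std::size_t c) const {
        return data_[r * cols_ + c];
    }

    // Element-wise sum; both operands must have the same shape.
    SimpleMatrix operator+(const SimpleMatrix& other) const {
        assert(rows_ == other.rows_ && cols_ == other.cols_);
        SimpleMatrix result(rows_, cols_);
        for (std::size_t i = 0; i < data_.size(); ++i) {
            result.data_[i] = data_[i] + other.data_[i];
        }
        return result;
    }

    // Matrix product; the inner dimensions must agree.
    SimpleMatrix operator*(const SimpleMatrix& other) const {
        assert(cols_ == other.rows_);
        SimpleMatrix result(rows_, other.cols_);
        for (std::size_t i = 0; i < rows_; ++i) {
            for (std::size_t k = 0; k < cols_; ++k) {
                for (std::size_t j = 0; j < other.cols_; ++j) {
                    result(i, j) += (*this)(i, k) * other(k, j);
                }
            }
        }
        return result;
    }

    SimpleMatrix transpose() const {
        SimpleMatrix result(cols_, rows_);
        for (std::size_t r = 0; r < rows_; ++r) {
            for (std::size_t c = 0; c < cols_; ++c) {
                result(c, r) = (*this)(r, c);
            }
        }
        return result;
    }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<T> data_;
};
```

The i-k-j loop order in the product walks both operands row by row, which tends to be friendlier to the cache than the textbook i-j-k order for row-major storage.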
Contributing
This is a learning-focused project. Feel free to:
- Implement the TODO items in the headers
- Add comprehensive tests for your implementations
- Improve documentation and examples
- Optimize performance and memory usage
- Add new algorithms and techniques
License
Apache License 2.0
Resources
Happy Learning!