gonp

module
v0.0.0-...-0d087c0 Latest
Published: Aug 17, 2025 License: MIT

README

GoNP - Go NumPy + Pandas

A high-performance numerical computing library for Go, providing NumPy and Pandas-like functionality with Go's type safety and performance characteristics.

🚀 Project Status

GoNP is actively developed with mature core building blocks (arrays, math, stats, series, dataframe) and comprehensive documentation and examples. Hardware acceleration (CUDA/OpenCL) is available via build tags. See the coverage badge for current test coverage and the Makefile for the recommended dev/test workflow.

✨ Key Features

  • 🔥 High Performance: 4.15x faster than naive implementations with SIMD optimization
  • 🛡️ Type Safety: Compile-time type checking prevents runtime errors
  • ⚡ Hardware Acceleration: SIMD (AVX2/AVX-512), GPU (CUDA/OpenCL), NUMA, Distributed computing
  • 🧮 Complete Math Suite: 350+ mathematical functions with optimized implementations
  • 📊 Advanced Analytics: Statistics, Machine Learning, Signal Processing, Bayesian inference
  • 🗃️ Enterprise I/O: CSV, Parquet, SQL with streaming and compression
  • 🔒 Production Grade: Monitoring, security audit, memory management, error handling
  • 🔄 Easy Migration: Direct NumPy/Pandas API equivalents with comprehensive migration guide

📦 Installation

Basic Installation
go get github.com/julianshen/gonp@latest

Or import the packages you need in your code and run go mod tidy.

Hardware Acceleration (Optional)
  • CUDA build (requires CUDA toolchain and CGO):
go build -tags cuda ./...
go test  -tags cuda ./gpu
  • OpenCL build (requires OpenCL headers/runtime and CGO):
go build -tags opencl ./...
go test  -tags opencl ./gpu

You can also use make build-prod for an optimized build. Note: the extra tags in that target are reserved; only cuda and opencl are currently used by the codebase.
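For readers new to build tags, the sketch below shows the usual Go pattern for an optional accelerated backend: a default file that is compiled only when neither tag is set, paired with tag-specific sibling files. Only the cuda and opencl tag names come from the text above; the file name and the backendName function are hypothetical and are not taken from the GoNP source.

```go
//go:build !cuda && !opencl

// backend_cpu.go (hypothetical file name): compiled only when neither
// the cuda nor the opencl build tag is set.
package main

import "fmt"

// backendName reports which backend this build selected. A sibling file
// guarded by "//go:build cuda" would define the same function and
// return "cuda" instead.
func backendName() string {
	return "cpu"
}

func main() {
	fmt.Println("backend:", backendName())
}
```

Built with plain go build, this file is included and the program reports the CPU backend; built with -tags cuda, the constraint excludes it and the tagged sibling would be compiled in its place.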

Prerequisites
  • Go 1.25+ (required)
  • CUDA 11.0+ (optional, for GPU acceleration)
  • OpenCL 2.0+ (optional, for GPU acceleration)
  • Build tools (optional, for advanced features)

๐Ÿƒ Quick Start

Array Operations (NumPy-like)
import (
    "log"

    "github.com/julianshen/gonp/array"
    "github.com/julianshen/gonp/math"
)

// Create an array from a Go slice
data := []float64{1, 2, 3, 4, 5}
arr, err := array.FromSlice(data)
if err != nil {
    log.Fatal(err)
}

// Mathematical operations (SIMD-optimized)
squared := math.Square(arr)        // [1, 4, 9, 16, 25]
sines := math.Sin(arr)             // Element-wise sine

// matrix1 and matrix2 are 2-D arrays created elsewhere (not shown here)
result := math.MatMul(matrix1, matrix2) // Matrix multiplication
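To make explicit what the element-wise and matrix calls above compute, here is a plain-Go sketch that uses only the standard library. It is not GoNP's implementation (which the README describes as SIMD-optimized); mapSlice and matMul are hypothetical helper names chosen for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// mapSlice applies f to every element and returns a new slice.
func mapSlice(xs []float64, f func(float64) float64) []float64 {
	out := make([]float64, len(xs))
	for i, x := range xs {
		out[i] = f(x)
	}
	return out
}

// matMul multiplies an n×k matrix by a k×m matrix, both stored as
// slices of rows. It assumes the inner dimensions already match.
func matMul(a, b [][]float64) [][]float64 {
	n, k, m := len(a), len(b), len(b[0])
	out := make([][]float64, n)
	for i := 0; i < n; i++ {
		out[i] = make([]float64, m)
		for j := 0; j < m; j++ {
			for p := 0; p < k; p++ {
				out[i][j] += a[i][p] * b[p][j]
			}
		}
	}
	return out
}

func main() {
	data := []float64{1, 2, 3, 4, 5}
	squared := mapSlice(data, func(x float64) float64 { return x * x })
	sines := mapSlice(data, math.Sin)
	fmt.Println(squared)    // [1 4 9 16 25]
	fmt.Println(len(sines)) // 5

	a := [][]float64{{1, 2}, {3, 4}}
	b := [][]float64{{5, 6}, {7, 8}}
	fmt.Println(matMul(a, b)) // [[19 22] [43 50]]
}
```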
Series Operations (Pandas-like)
import (
    "log"

    "github.com/julianshen/gonp/series"
)

// Create a Series with labels
values := []float64{100, 200, 300}
labels := []interface{}{"A", "B", "C"}
s, err := series.FromSlice(values, series.NewIndex(labels), "prices")
if err != nil {
    log.Fatal(err)
}

// Access data
priceA := s.Loc("A")               // 100
filtered := s.Where(func(x interface{}) bool {
    return x.(float64) > 150
})

// Statistical operations
mean := s.Mean()                   // 200
std := s.Std()                     // Standard deviation
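The semantics of the Series calls above (label lookup, predicate filtering, mean, standard deviation) can be sketched in plain Go. The labeledSeries type and its methods are hypothetical stand-ins, not GoNP's API. The sketch assumes the sample standard deviation (dividing by n-1, the Pandas default); whether GoNP's Std uses the same convention is not stated in this README.

```go
package main

import (
	"fmt"
	"math"
)

// labeledSeries pairs each value with a string label (hypothetical type).
type labeledSeries struct {
	labels []string
	values []float64
}

// loc returns the value stored under label, and false if the label is absent.
func (s labeledSeries) loc(label string) (float64, bool) {
	for i, l := range s.labels {
		if l == label {
			return s.values[i], true
		}
	}
	return 0, false
}

// where keeps only the entries whose value satisfies pred.
func (s labeledSeries) where(pred func(float64) bool) labeledSeries {
	var out labeledSeries
	for i, v := range s.values {
		if pred(v) {
			out.labels = append(out.labels, s.labels[i])
			out.values = append(out.values, v)
		}
	}
	return out
}

// mean returns the arithmetic mean (NaN for an empty series).
func (s labeledSeries) mean() float64 {
	sum := 0.0
	for _, v := range s.values {
		sum += v
	}
	return sum / float64(len(s.values))
}

// sampleStd returns the sample standard deviation (divides by n-1).
func (s labeledSeries) sampleStd() float64 {
	m, ss := s.mean(), 0.0
	for _, v := range s.values {
		ss += (v - m) * (v - m)
	}
	return math.Sqrt(ss / float64(len(s.values)-1))
}

func main() {
	s := labeledSeries{[]string{"A", "B", "C"}, []float64{100, 200, 300}}
	a, _ := s.loc("A")
	fmt.Println(a) // 100
	filtered := s.where(func(x float64) bool { return x > 150 })
	fmt.Println(filtered.labels)         // [B C]
	fmt.Println(s.mean(), s.sampleStd()) // 200 100
}
```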
DataFrame Operations (Pandas-like)
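import (
    "github.com/julianshen/gonp/array"
    "github.com/julianshen/gonp/dataframe"
)

// Build the columns first: array.FromSlice returns (array, error), so it
// cannot be called inside the map literal. Errors are ignored for brevity.
names, _ := array.FromSlice([]string{"Alice", "Bob", "Charlie"})
ages, _ := array.FromSlice([]int{25, 30, 35})
salaries, _ := array.FromSlice([]float64{50000, 60000, 70000})
// Illustrative column, added so the GroupBy example below has a key to group on
departments, _ := array.FromSlice([]string{"Eng", "Eng", "Sales"})

// Create DataFrame
data := map[string]*array.Array{
    "name":       names,
    "age":        ages,
    "salary":     salaries,
    "department": departments,
}
df, _ := dataframe.FromMap(data)

// Data operations
summary := df.Describe()           // Statistical summary
highEarners := df.Where("salary", func(x interface{}) bool {
    return x.(float64) > 55000
})

// GroupBy operations
grouped := df.GroupBy("department")
avgSalaries := grouped.Mean()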
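As a rough illustration of what a group-by followed by a mean computes, the following plain-Go sketch accumulates per-key sums and counts in maps. The row type, the groupMean function, and the sample values are hypothetical and independent of GoNP's DataFrame implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// row is a hypothetical record with a grouping key and a numeric value.
type row struct {
	department string
	salary     float64
}

// groupMean returns the mean salary per department.
func groupMean(rows []row) map[string]float64 {
	sums := map[string]float64{}
	counts := map[string]int{}
	for _, r := range rows {
		sums[r.department] += r.salary
		counts[r.department]++
	}
	means := make(map[string]float64, len(sums))
	for k, s := range sums {
		means[k] = s / float64(counts[k])
	}
	return means
}

func main() {
	rows := []row{{"Eng", 50000}, {"Eng", 60000}, {"Sales", 70000}}
	means := groupMean(rows)

	// Map iteration order is random in Go, so sort the keys for stable output.
	keys := make([]string, 0, len(means))
	for k := range means {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Println(k, means[k])
	}
	// Prints:
	// Eng 55000
	// Sales 70000
}
```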

📊 Feature Matrix

✅ Core Foundation (Stable)
  • N-dimensional arrays: SIMD-aware paths with scalar fallback
  • Mathematical functions and linear algebra: broad coverage with numerically stable implementations
  • Statistics: descriptive stats, regression, ANOVA
  • Series & DataFrames: labeled 1D/2D structures with indexing, GroupBy, merge/join
  • I/O: CSV, JSON, Excel, Parquet, SQL
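The list above mentions numerically stable implementations without naming a technique. One well-known example of such a technique for descriptive statistics is Welford's single-pass algorithm for mean and variance, sketched below. This is offered as an illustration only; the README does not say which algorithms GoNP uses, and the welford function name is hypothetical.

```go
package main

import "fmt"

// welford returns the mean and sample variance of xs in a single pass.
// It updates a running mean and a running sum of squared deviations (m2),
// which avoids the cancellation that the naive sum-of-squares formula
// suffers when values are large relative to their spread.
func welford(xs []float64) (mean, variance float64) {
	var m2 float64
	for i, x := range xs {
		delta := x - mean
		mean += delta / float64(i+1)
		m2 += delta * (x - mean)
	}
	if len(xs) < 2 {
		return mean, 0
	}
	return mean, m2 / float64(len(xs)-1)
}

func main() {
	// The values 4, 7, 13, 16 shifted by a large offset: the spread is
	// unchanged, so the sample variance is still 30.
	xs := []float64{1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16}
	_, variance := welford(xs)
	fmt.Println(variance) // 30
}
```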
🎯 Advanced Features (Available)
  • SIMD: AVX/AVX2/AVX-512 where available; NEON on arm64 (asm behind neonasm tag)
  • Parallel processing: multi-threaded operations
  • Sparse matrices: COO/CSR/CSC
  • Visualization: matplotlib-style API (rendering WIP)
  • Time series utilities
  • Database integration (SQL) and connection management
  • Memory optimization and pooling
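For readers unfamiliar with the sparse formats named in the list above, the sketch below converts a matrix from COO (coordinate triplets) to CSR (compressed sparse row) in plain Go. The coo and csr types and the toCSR function are hypothetical; GoNP's sparse package may lay out its data differently.

```go
package main

import "fmt"

// coo stores a sparse matrix as parallel (row, col, value) triplets.
type coo struct {
	rows, cols int
	r, c       []int
	v          []float64
}

// csr stores the same matrix with a row pointer array: the entries of
// row i live in colIdx[rowPtr[i]:rowPtr[i+1]] and vals[rowPtr[i]:rowPtr[i+1]].
type csr struct {
	rowPtr []int
	colIdx []int
	vals   []float64
}

// toCSR converts COO to CSR with a counting pass followed by a scatter pass.
// Duplicate coordinates are kept as separate entries rather than summed.
func toCSR(m coo) csr {
	out := csr{
		rowPtr: make([]int, m.rows+1),
		colIdx: make([]int, len(m.v)),
		vals:   make([]float64, len(m.v)),
	}
	// Count the entries in each row, then prefix-sum into row offsets.
	for _, r := range m.r {
		out.rowPtr[r+1]++
	}
	for i := 0; i < m.rows; i++ {
		out.rowPtr[i+1] += out.rowPtr[i]
	}
	// Scatter each triplet into the next free slot of its row.
	next := append([]int(nil), out.rowPtr[:m.rows]...)
	for k, r := range m.r {
		p := next[r]
		out.colIdx[p], out.vals[p] = m.c[k], m.v[k]
		next[r]++
	}
	return out
}

func main() {
	// 3x3 matrix with entries (0,0)=1, (2,1)=5, (0,2)=2.
	m := coo{3, 3, []int{0, 2, 0}, []int{0, 1, 2}, []float64{1, 5, 2}}
	out := toCSR(m)
	fmt.Println(out.rowPtr) // [0 2 2 3]
	fmt.Println(out.colIdx) // [0 2 1]
	fmt.Println(out.vals)   // [1 2 5]
}
```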

🔥 Performance

Representative benchmarks in this repository and docs illustrate SIMD, parallel, and GPU benefits for larger datasets. Results vary by hardware and workload; see make bench, the internal/ tests, and GPU benchmarks under gpu/ for reproducible measurements.

📚 Documentation

Complete API Documentation
Migration and Guides

๐Ÿ—๏ธ Architecture

GoNP Architecture
โ”œโ”€โ”€ Core Foundation
โ”‚   โ”œโ”€โ”€ array/          # N-dimensional arrays (NumPy equivalent)
โ”‚   โ”œโ”€โ”€ series/         # 1D labeled arrays (Pandas Series)  
โ”‚   โ”œโ”€โ”€ dataframe/      # 2D data structures (Pandas DataFrame)
โ”‚   โ””โ”€โ”€ internal/       # Memory management, SIMD, validation
โ”œโ”€โ”€ Mathematical Computing  
โ”‚   โ”œโ”€โ”€ math/           # Universal functions, linear algebra
โ”‚   โ”œโ”€โ”€ stats/          # Statistics, regression, ANOVA
โ”‚   โ”œโ”€โ”€ fft/            # Fast Fourier Transform
โ”‚   โ””โ”€โ”€ random/         # Random number generation
โ”œโ”€โ”€ Data Processing
โ”‚   โ”œโ”€โ”€ io/             # CSV, Parquet, SQL I/O with optimization
โ”‚   โ”œโ”€โ”€ sparse/         # Sparse matrix operations
โ”‚   โ””โ”€โ”€ parallel/       # Multi-threading and parallel processing
โ”œโ”€โ”€ Visualization
โ”‚   โ””โ”€โ”€ visualization/  # Plotting and data visualization
โ””โ”€โ”€ Documentation
    โ”œโ”€โ”€ docs/           # Migration guides and advanced documentation
    โ””โ”€โ”€ examples/       # Comprehensive usage examples

🔄 Migration from Python

GoNP provides direct equivalents for NumPy and Pandas operations:

# NumPy/Pandas (Python)          →  GoNP (Go)
import numpy as np               →  import "github.com/julianshen/gonp/array"
import pandas as pd              →  import "github.com/julianshen/gonp/dataframe"

np.array([1, 2, 3])              →  array.FromSlice([]float64{1, 2, 3})
np.sin(arr)                      →  math.Sin(arr)
np.dot(a, b)                     →  math.Dot(a, b)
pd.DataFrame(data)               →  dataframe.FromMapInterface(data)
df.groupby('col').mean()         →  df.GroupBy("col").Mean()
df.merge(df2, on='key')          →  df.Merge(df2, "key", dataframe.InnerJoin)

See the complete migration guide for detailed conversions.
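For readers coming from Pandas, the inner-join behavior of the merge row in the mapping above can be sketched in plain Go as a hash join: index one side by key, then emit a row for every matching pair. The record and joined types and the innerJoin function are hypothetical and are not GoNP's implementation.

```go
package main

import "fmt"

// record is a hypothetical row: a join key plus one payload column.
type record struct {
	key   string
	value string
}

// joined holds one output row of the inner join.
type joined struct {
	key         string
	left, right string
}

// innerJoin returns one output row for every pair of left and right rows
// that share a key, using a hash index built over the right-hand side.
// Keys present on only one side produce no output.
func innerJoin(left, right []record) []joined {
	index := make(map[string][]string)
	for _, r := range right {
		index[r.key] = append(index[r.key], r.value)
	}
	var out []joined
	for _, l := range left {
		for _, rv := range index[l.key] {
			out = append(out, joined{l.key, l.value, rv})
		}
	}
	return out
}

func main() {
	left := []record{{"k1", "Alice"}, {"k2", "Bob"}, {"k3", "Charlie"}}
	right := []record{{"k2", "Sales"}, {"k3", "Eng"}, {"k4", "Ops"}}
	for _, row := range innerJoin(left, right) {
		fmt.Println(row.key, row.left, row.right)
	}
	// Prints:
	// k2 Bob Sales
	// k3 Charlie Eng
}
```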

🚀 Getting Started

  1. Install GoNP:

    go get github.com/julianshen/gonp
    
  2. Run Examples:

    cd examples/
    go run basic_operations.go
    go run data_analysis.go
    
  3. Read Documentation:

๐Ÿค Contributing

GoNP follows Test-Driven Development (TDD) with comprehensive tooling:

Development Workflow
# Setup development environment
git clone https://github.com/julianshen/gonp.git
cd gonp
make deps install-tools

# Development commands
make dev           # Complete development workflow
make test-core     # Run core tests
make bench         # Run performance benchmarks
make check         # All quality checks (format, lint, security, test)
make coverage      # Generate test coverage report

# TDD workflow
make tdd           # Red-Green-Refactor cycle

# Production build
make build-prod    # Optimized production build
make deploy-check  # Pre-deployment verification
Code Quality Standards
  • 61.6% test coverage with comprehensive TDD methodology
  • Zero memory leaks detected in production testing
  • Structured error handling with recovery suggestions
  • Security scanning with OWASP compliance checks
  • Performance regression testing for critical paths
Contribution Guidelines
  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Write tests first (TDD Red phase)
  4. Implement functionality (TDD Green phase)
  5. Refactor and optimize (TDD Refactor phase)
  6. Run quality checks (make check)
  7. Commit changes (git commit -m 'Add amazing feature')
  8. Push to branch (git push origin feature/amazing-feature)
  9. Open a Pull Request

See CLAUDE.md for detailed development guidelines and architecture documentation.

📈 Project Overview

The project targets robust, high-performance numerical computing in Go with strong ergonomics and type safety.

Core Systems
  • Arrays, math, stats, and data structures with broad functionality
  • I/O for common formats (CSV, JSON, Excel, Parquet, SQL)
  • Performance optimizations: SIMD where available; optional GPU paths
Quality Metrics
  • Test coverage: 61.6% (see badge and make coverage)
  • Cross-arch tests use -tags vet for determinism; examples are excluded from tests
  • TDD methodology with benchmarks for critical paths
Platforms
  • x86_64 SIMD (AVX/AVX2/AVX-512) and arm64 NEON (pure-Go helpers by default; asm behind neonasm)
  • Optional GPU via cuda or opencl build tags
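The idea of a platform-specific fast path with a portable fallback, as listed above, can be sketched as a kernel chosen once per architecture. The version below is pure Go: sumUnrolled is only a stand-in for a real SIMD kernel, which would normally be written in assembly and selected via CPU feature detection. All function names are hypothetical, and this is not how GoNP is known to dispatch.

```go
package main

import (
	"fmt"
	"runtime"
)

// sumScalar is the portable fallback: a plain loop.
func sumScalar(xs []float64) float64 {
	total := 0.0
	for _, x := range xs {
		total += x
	}
	return total
}

// sumUnrolled handles four elements per iteration with independent
// accumulators, then finishes the remainder with a scalar tail loop.
func sumUnrolled(xs []float64) float64 {
	var a, b, c, d float64
	i := 0
	for ; i+4 <= len(xs); i += 4 {
		a += xs[i]
		b += xs[i+1]
		c += xs[i+2]
		d += xs[i+3]
	}
	total := a + b + c + d
	for ; i < len(xs); i++ {
		total += xs[i]
	}
	return total
}

// pickSum chooses a kernel by architecture name and falls back to the
// scalar loop on anything it does not recognize.
func pickSum(goarch string) (string, func([]float64) float64) {
	switch goarch {
	case "amd64", "arm64":
		return "unrolled", sumUnrolled
	default:
		return "scalar", sumScalar
	}
}

func main() {
	name, sum := pickSum(runtime.GOARCH)
	// The kernel name depends on the machine; the sum is 28 either way.
	fmt.Println(name, sum([]float64{1, 2, 3, 4, 5, 6, 7}))
}
```

Note that the two kernels add elements in a different order, so on inputs with rounding error their results can differ in the last bits; the small integers used here sum exactly.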

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

๐Ÿ™ Acknowledgments

  • NumPy & Pandas Teams: For creating the foundational APIs that inspired this library
  • Go Community: For providing excellent tooling and development practices
  • Contributors: All developers who help improve GoNP

📞 Support & Community

GoNP: Bringing the power of NumPy and Pandas to Go with native performance and type safety. 🚀

Documentation • Examples • Migration Guide • Contributing

Directories

Path Synopsis
Package array provides n-dimensional array functionality for Go.
Package benchmarks provides comprehensive performance benchmarking for GPU vs CPU operations.
Package dataframe provides a Pandas-like DataFrame data structure for GoNP.
Package examples demonstrates basic GoNP operations
Package gpu provides GPU acceleration interfaces for GoNP.
Package math provides mathematical functions and operations for GoNP arrays.
Package series provides a Pandas-like Series data structure for GoNP.
Arithmetic operations for sparse matrices
Package stats provides statistical analysis functions for GoNP arrays and data structures.
Matplotlib-style plotting API for GoNP
