Trending

See what the GitHub community is most excited about today.

NVIDIA / cub

[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl

Cuda · 1,687 stars · 449 forks

0 stars today

NVlabs / instant-ngp

Instant neural graphics primitives: lightning fast NeRF and more

Cuda · 16,095 stars · 1,942 forks

6 stars today

HazyResearch / ThunderKittens

Tile primitives for speedy kernels

Cuda · 1,730 stars · 77 forks

4 stars today

flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving

Cuda · 1,523 stars · 151 forks

1 star today

rapidsai / raft

RAFT contains fundamental, widely used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for writing high-performance applications more easily.

Cuda · 798 stars · 196 forks

1 star today

HigherOrderCO / HVM

A massively parallel, optimal functional runtime in Rust

Cuda · 10,554 stars · 408 forks

2 stars today

rapidsai / cuvs

cuVS - a library for vector search and clustering on the GPU

Cuda · 238 stars · 66 forks

1 star today

rapidsai / cugraph

cuGraph - RAPIDS Graph Analytics Library

Cuda · 1,775 stars · 304 forks

1 star today

Dao-AILab / causal-conv1d

Causal depthwise conv1d in CUDA, with a PyTorch interface

Cuda · 346 stars · 66 forks

0 stars today

NVIDIA / nvbench

CUDA Kernel Benchmarking Library

Cuda · 529 stars · 66 forks

1 star today

karpathy / llm.c

LLM training in simple, raw C/CUDA

Cuda · 24,666 stars · 2,795 forks

6 stars today

leoxiaobin / deep-high-resolution-net.pytorch

The project is an official implementation of our CVPR2019 paper "Deep High-Resolution Representation Learning for Human Pose Estimation"

Cuda · 4,344 stars · 917 forks

1 star today

SHI-Labs / NATTEN

Neighborhood Attention Extension. Bringing attention to a neighborhood near you!

Cuda · 381 stars · 31 forks

1 star today