Stars 1,590

Watchers 38

Forks 54

Last Commit over 1 year ago

Description

Emu is a high-level language for programming GPUs. Unlike other languages such as OpenCL or Halide that are designed for embedding in C or C++, Emu is designed for embedding in Rust. It provides a single procedural macro for writing functions. The macro translates the functions at compile time into lower-level code so that they can be run on the GPU.

Emu also provides several features that aim to make programming GPUs more accessible such as built-in mathematical and physical constants, unit annotation and implicit conversion.

Programming language: Rust

License: MIT License

Tags: Scripting Computation Machine-learning Statistics language Programming Languages Data Processing

README

The old version of Emu (which used macros) is here.

Overview

Emu is a GPGPU library for Rust with a focus on portability, modularity, and performance.

It's a CUDA-esque compute-specific abstraction over WebGPU providing specific functionality to make WebGPU feel more like CUDA. Here's a quick run-down of highlight features...

Emu can run anywhere - Emu uses WebGPU to support DirectX, Metal, and Vulkan (and eventually OpenGL and the browser) as compile targets. This allows Emu to run on pretty much any platform, including desktop, mobile, and browser. By moving heavy computations to the user's device, you can reduce system latency and improve privacy.
Emu makes compute easier - Emu makes WebGPU feel like CUDA. It does this by providing...
- DeviceBox<T> as a wrapper for data that lives on the GPU (thereby ensuring type-safe data movement)
- DevicePool as a no-config auto-managed pool of devices (similar to CUDA)
- trait Cache - a no-setup-required LRU cache of JITed compute kernels.
Emu is transparent - Emu is a fully transparent abstraction. This means, at any point, you can decide to remove the abstraction and work directly with WebGPU constructs with zero overhead. For example, if you want to mix Emu with WebGPU-based graphics, you can do that with zero overhead. You can also swap out the JIT compiler artifact cache with your own cache, manage the device pool if you wish, and define your own compile-to-SPIR-V compiler that interops with Emu.
Emu is asynchronous - Emu is fully asynchronous. Most API calls are non-blocking; you synchronize by calling DeviceBox::get, which reads data back from the device.
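To make the "LRU cache of JITed compute kernels" idea concrete, here is a minimal sketch of least-recently-used caching using only the standard library. It is not Emu's Cache trait or its GlobalCache: the LruKernelCache type, its methods, and the use of a Vec<u32> as a stand-in for a compiled artifact are all hypothetical names chosen for illustration.

```rust
use std::collections::HashMap;

/// A toy least-recently-used cache from kernel source text to a "compiled"
/// artifact. Hypothetical illustration only, not Emu's `Cache` trait.
pub struct LruKernelCache {
    capacity: usize,
    tick: u64,                                 // logical clock used to track recency
    entries: HashMap<String, (Vec<u32>, u64)>, // source -> (artifact, last-used tick)
    pub compilations: usize,                   // number of actual compile calls (misses)
}

impl LruKernelCache {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be at least 1");
        Self { capacity, tick: 0, entries: HashMap::new(), compilations: 0 }
    }

    /// Return the cached artifact for `source`, calling `compile` only on a miss.
    pub fn get_or_compile<F>(&mut self, source: &str, compile: F) -> Vec<u32>
    where
        F: FnOnce(&str) -> Vec<u32>,
    {
        self.tick += 1;
        let now = self.tick;

        // Hit: refresh the recency stamp and hand back a copy of the artifact.
        if let Some((artifact, last_used)) = self.entries.get_mut(source) {
            *last_used = now;
            return artifact.clone();
        }

        // Miss with a full cache: evict the entry with the oldest stamp.
        if self.entries.len() >= self.capacity {
            let oldest = self
                .entries
                .iter()
                .min_by_key(|(_, (_, last_used))| *last_used)
                .map(|(key, _)| key.clone());
            if let Some(key) = oldest {
                self.entries.remove(&key);
            }
        }

        self.compilations += 1;
        let artifact = compile(source);
        self.entries.insert(source.to_string(), (artifact.clone(), now));
        artifact
    }

    pub fn contains(&self, source: &str) -> bool {
        self.entries.contains_key(source)
    }
}
```

With a capacity of 2, requesting sources "a", "bb", "a", "ccc" in that order compiles three times and evicts "bb", because "a" was touched more recently. The linear scan on eviction keeps the sketch short; a production cache would typically use a linked structure for constant-time eviction.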

An example

Here's a quick example of Emu. You can find more in emu_core/examples and the most recent documentation here.

First, we just import a bunch of stuff.

use emu_glsl::*;
use emu_core::prelude::*;
use zerocopy::*;

We can define struct types that can be safely serialized and deserialized to/from the GPU.

#[repr(C)]
#[derive(AsBytes, FromBytes, Copy, Clone, Default, Debug)]
struct Rectangle {
    x: u32,
    y: u32,
    w: i32,
    h: i32,
}
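As a rough illustration of why #[repr(C)] matters here, the sketch below uses only the standard library to show the byte layout that a struct like Rectangle ends up with: four 4-byte fields in declaration order and no padding. The helpers to_bytes, from_bytes, and layout_of_rectangle are hypothetical and written by hand for this example; in the real code the zerocopy derives do this work for you.

```rust
use std::mem::{align_of, size_of};

#[repr(C)]
#[derive(Copy, Clone, Default, Debug, PartialEq)]
struct Rectangle {
    x: u32,
    y: u32,
    w: i32,
    h: i32,
}

/// Size and alignment of `Rectangle` in bytes.
fn layout_of_rectangle() -> (usize, usize) {
    (size_of::<Rectangle>(), align_of::<Rectangle>())
}

/// Hand-written serialization: each field in declaration order,
/// using the machine's native endianness.
fn to_bytes(r: &Rectangle) -> [u8; 16] {
    let mut out = [0u8; 16];
    out[0..4].copy_from_slice(&r.x.to_ne_bytes());
    out[4..8].copy_from_slice(&r.y.to_ne_bytes());
    out[8..12].copy_from_slice(&r.w.to_ne_bytes());
    out[12..16].copy_from_slice(&r.h.to_ne_bytes());
    out
}

/// Inverse of `to_bytes`: rebuild the struct from its 16 bytes.
fn from_bytes(b: &[u8; 16]) -> Rectangle {
    // Pick out the 4-byte word starting at offset `i`.
    let word = |i: usize| [b[i], b[i + 1], b[i + 2], b[i + 3]];
    Rectangle {
        x: u32::from_ne_bytes(word(0)),
        y: u32::from_ne_bytes(word(4)),
        w: i32::from_ne_bytes(word(8)),
        h: i32::from_ne_bytes(word(12)),
    }
}
```

Because every field is 4 bytes wide and 4-byte aligned, the struct occupies exactly 16 bytes, which lines up with the uint/uint/int/int struct declared on the GLSL side of the example.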

For this example, we make this entire function async but in reality you will only want small blocks of code to be async (like a bunch of asynchronous memory transfers and computation) and these blocks will be sent off to an executor to execute. You definitely don't want to do something like this where you are blocking (by doing an entire compilation step) in your async code.

async fn do_some_stuff() -> Result<(), Box<dyn std::error::Error>> {
    // make sure a device pool has been initialized before we use any device
    assert_device_pool_initialized().await;

    // first, we move a bunch of rectangles to the GPU
    let mut x: DeviceBox<[Rectangle]> = vec![Default::default(); 128].as_device_boxed()?;

    // then we compile some GLSL code using the GlslCompile compiler and
    // the GlobalCache for caching compiler artifacts
    let c = compile::<String, GlslCompile, _, GlobalCache>(
        GlslBuilder::new()
            .set_entry_point_name("main")
            .add_param_mut()
            .set_code_with_glsl(
                r#"
#version 450
layout(local_size_x = 1) in; // our thread block size is 1, that is we only have 1 thread per block

struct Rectangle {
    uint x;
    uint y;
    int w;
    int h;
};

// make sure to use only a single set and keep all your n parameters in n storage buffers in bindings 0 to n-1
// you shouldn't use push constants or anything OTHER than storage buffers for passing stuff into the kernel
// just use buffers with one buffer per binding
layout(set = 0, binding = 0) buffer Rectangles {
    Rectangle[] rectangles;
}; // this is used as both input and output for convenience

Rectangle flip(Rectangle r) {
    r.x = r.x + r.w;
    r.y = r.y + r.h;
    r.w *= -1;
    r.h *= -1;
    return r;
}

// there should be only one entry point and it should be named "main"
// ultimately, Emu has to kind of restrict how you use GLSL because it is compute focused
void main() {
    uint index = gl_GlobalInvocationID.x; // this gives us the index in the x dimension of the thread space
    rectangles[index] = flip(rectangles[index]);
}
                "#,
            )
    )?.finish()?;

    // we spawn 128 threads (really 128 thread blocks)
    unsafe {
        spawn(128).launch(call!(c, &mut x));
    }

    // this is the Future we need to await to get stuff to happen
    // everything else is non-blocking in the API (except stuff like compilation)
    println!("{:?}", x.get().await?);

    Ok(())
}
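To check what the kernel is expected to compute, it can help to have a CPU reference. The sketch below mirrors the GLSL flip function and the one-dimensional indexing in plain Rust with no GPU or Emu involved. The names flip and run_on_cpu are hypothetical helpers for this illustration, and the wrapping arithmetic assumes GLSL converts the int operand to uint and wraps on overflow.

```rust
#[derive(Copy, Clone, Default, Debug, PartialEq)]
struct Rectangle {
    x: u32,
    y: u32,
    w: i32,
    h: i32,
}

/// CPU mirror of the GLSL `flip`. The signed width/height are reinterpreted
/// as unsigned and added with wraparound, matching the assumed GLSL behavior
/// for `uint + int`.
fn flip(mut r: Rectangle) -> Rectangle {
    r.x = r.x.wrapping_add(r.w as u32);
    r.y = r.y.wrapping_add(r.h as u32);
    r.w = r.w.wrapping_mul(-1);
    r.h = r.h.wrapping_mul(-1);
    r
}

/// Emulate a 1-D dispatch of `groups` thread blocks with `local_size_x`
/// threads each. The global index is group * local_size_x + local, which is
/// what `gl_GlobalInvocationID.x` holds in the kernel.
fn run_on_cpu(rectangles: &mut [Rectangle], groups: u32, local_size_x: u32) {
    for group in 0..groups {
        for local in 0..local_size_x {
            let index = (group * local_size_x + local) as usize;
            // Unlike the GLSL kernel, skip indices past the end of the buffer.
            if let Some(r) = rectangles.get_mut(index) {
                *r = flip(*r);
            }
        }
    }
}
```

Calling run_on_cpu(&mut data, 128, 1) corresponds to spawn(128) with local_size_x = 1 in the example: each of the 128 blocks has one thread, so each global index from 0 to 127 is visited exactly once. Applying flip twice returns the original rectangle, which makes a convenient sanity check for GPU output.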

And last but certainly not least, we use an executor to execute.

fn main() {
    futures::executor::block_on(do_some_stuff()).expect("failed to do stuff on GPU");
}

Built with Emu

Emu is relatively new but has already been used for GPU acceleration in a variety of projects.

Used in toil for GPU-accelerated linear algebra
Used in ipl3hasher for hash collision finding
Used in bigbang for simulating gravitational acceleration (used older version of Emu)

Getting started

The latest stable version is on Crates.io. To start using Emu, add the following to your Cargo.toml.

[dependencies]
emu_core = "0.1.1"

To understand how to start using Emu, check out the docs. If you have any questions, please ask in the Discord.

Contributing

Feedback, discussion, and PRs would all be very much appreciated. The following relatively high-priority, non-API-breaking things have yet to be implemented, listed in rough order of priority.