How Bend Works: A Parallel Programming Language That "Feels Like Python but Scales Like CUDA"

Author: Murphy  |  Time: 2025-03-22 21:03:49

Introduction

If you are reading this article, you probably recently heard about Bend, a new programming language that aims to be massively parallel without you having to worry about thread creation and other common parallel programming concerns.

They claim it "feels like Python, but scales like CUDA". As an enthusiast of parallel programming, it caught my attention immediately. After some reading, I found that Bend is powered by HVM (Higher-Order Virtual Machine), the runtime where the magic happens. That is, Bend code is compiled to HVM, which runs the program in an inherently parallel manner. In other words, every operation that can be parallelized is automatically parallelized by the runtime.

Straight away, I wanted to learn how all of the HVM magic happens. How can all of this be possible? After some reading, I learned that the magic behind HVM is mostly based on Interaction Combinators, a model of computation based on graphs and graphical rewrite rules developed by Yves Lafont in the 1990s. So, I opened Lafont's paper, flipped through some pages and saw this:

Interaction Combinators "alien code". Image by author, inspired by Lafont, 1997

I felt like I was in that movie Arrival, where the aliens try to communicate with us using a strange symbolic language.

That's when I closed the laptop and gave up on trying to understand that.

A while later, when I turned on my machine again, those symbols were there, staring at me, as if they were asking me to be understood.

After a lot of reading, watching videos and alien help, I somehow started to understand this thing.

The purpose of this article is to briefly clarify how all the HVM magic happens and facilitate your further understanding by explaining some common terms you might find during your learning journey. In order to do that, we need to first learn some basic concepts.

λ-Calculus (Lambda Calculus)

The Lambda Calculus is a formal system in mathematical logic created by Alonzo Church in the 1930s. Its purpose was to investigate some aspects of logic theory from a purely mathematical point of view. Church was aiming to define computability in mathematical terms (what can be calculated using a set of fundamental rules). Let's start:

You probably have already used Lambda Calculus before. For example, imagine a function to multiply a number by two:

f(x) = 2 * x

In Python, you can express a named function for that like this:

def multiply_by_two(x):
    return 2 * x

print(multiply_by_two(2))
# 4

But you can also express that using lambdas, which are basically anonymous functions:

multiply_by_two_lambda = lambda x: x * 2

print(multiply_by_two_lambda(2))
# 4

So, let's go back to mathematics. In Lambda Calculus, you express this same function using the notation λx.2x, where x is the parameter and 2x is the body.

This is called an abstraction. An abstraction λx.t denotes an anonymous function that takes a single input variable x and returns t. For example, λx.(x²+2x) is an abstraction representing the function f defined by f(x) = x²+2x. So, an abstraction basically defines a function but does not invoke it.

You can also have a term like λx.(x+y), which is the definition of f(x) = x+y. Here, y has not been defined yet. The expression λx.(x+y) is a valid abstraction and represents a function that adds its input to the yet-unknown y.
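A Python analogy for a free variable like y is a lambda that closes over a name from the enclosing scope. A minimal sketch (the names `y` and `add_y` are mine):

```python
y = 10                   # the "yet-unknown" y, defined in the enclosing scope
add_y = lambda x: x + y  # like λx.(x+y): y is a free variable here

print(add_y(5))  # → 15
```

The lambda itself only binds x; the value of y is looked up when the function is applied.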

If λx.2x defines a function, (λx.2x)a "calls" that function with argument a. That is, we basically substitute the variable x with a.

f(x) = 2x

f(2) = 4

This is the same as:

λx.2x

(λx.2x)2 = 4

This is called an application. We are "applying" the abstraction (λx.2x) to the number 2.
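In Python terms, an application is just calling a lambda, possibly inline. A small sketch (the name `double` is mine):

```python
# The abstraction λx.2x as a Python lambda
double = lambda x: 2 * x

# The application (λx.2x)2: apply the abstraction to the number 2
print(double(2))             # → 4
print((lambda x: 2 * x)(2))  # same application written inline, → 4
```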

You can also apply a lambda expression to another lambda expression, such as nested functions:

Take f(x) = 2x and g(x) = x³

And you want g(f(x)):

You can express this using lambda expressions:

λx.2x

λx.x³

=> (λx.x³)((λx.2x)x)

Do not try to solve it for now, first understand the notation, and further I will show you how to solve it!

It's important to not confuse the parenthesis. For example:

1 – λx.((λx.x)x) is an abstraction (function definition).

2 – (λx.(λx.x))x is an application (function application).

In Example 1, we are defining a function λx.B, where B is the expression (λx.x)x, which is the anonymous function λx.x applied to the input x.

In Example 2, we are applying the anonymous function λx.(λx.x) to the input x.
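The same distinction can be sketched in Python (variable names are mine):

```python
# Example 1: an abstraction — a function whose body applies the
# identity lambda to its own argument. Nothing runs until we call it.
example_1 = lambda x: (lambda x: x)(x)
print(example_1(7))  # → 7

# Example 2: an application — the outer lambda returns the inner
# identity function, so applying it to 7 yields a function, not a number.
example_2 = (lambda x: (lambda x: x))(7)
print(example_2(3))  # → 3 (example_2 is the identity function)
```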

Applications can also be represented as f x (applying function f to the variable x).

We can also represent functions with n parameters using Lambda Calculus. This can be done by using nested functions, each taking a single parameter: f(x, y, z) → λx.λy.λz

Thus, f(x, y, z) = x + y + z can be expressed by the abstraction:

λx.λy.λz.(x + y + z)

Using this abstraction we can construct applications:

(λx.λy.λz.(x + y + z)) 1 2 3 => 6
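This nesting trick is known as currying. In Python it looks like chained single-parameter lambdas, applied one argument at a time (the name `add3` is mine):

```python
# λx.λy.λz.(x + y + z) as nested single-parameter lambdas
add3 = lambda x: lambda y: lambda z: x + y + z

# (λx.λy.λz.(x + y + z)) 1 2 3 — one application per parameter
print(add3(1)(2)(3))  # → 6
```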

When studying Lambda Calculus, there are also two common terms you might find:

Alpha conversion (α-conversion) and Beta reduction (β-reduction)

Alpha conversion

When evaluating more complex lambda expressions, you may obtain some expression like this:

(λx.(λx.x+x)x)

In this expression, the inner x could be mistakenly interpreted as the outer x. In order to avoid that, we can rename the inner variable x:

(λx.(λy.y+y)x)

This process is called α-conversion. The name sounds complex, but it is just this simple renaming of a variable to avoid mistakes.

λx.x → λy.y (α-conversion)

Both expressions represent the same function. The α-conversion does not change the function's behavior, just the variable name.
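The same point holds in Python: the parameter name is irrelevant to what the function computes. A tiny sketch (names are mine):

```python
id_x = lambda x: x
id_y = lambda y: y  # α-converted: only the bound variable was renamed

# Both compute the identity function
print(id_x(42), id_y(42))  # → 42 42
```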

Beta reduction

β-reduction is simply the process of calculating the result from an application of a function to an expression. For instance: (λx.xy)z

In the output xy, substitute every occurrence of x with z

= zy

You might also find the following notation:

(λ param . output)input => output [param := input] => result

This basically means that to get the result, you take the output and substitute every occurrence of param with the input. In the previous expression:

(λx.xy)z => (xy)[x := z] => zy
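As a toy model, β-reduction can be sketched as plain text substitution in Python. This is my own simplification: it substitutes blindly and ignores the variable-capture problems that α-conversion exists to solve.

```python
def beta_reduce(param, output, argument):
    # Toy model of (λ param . output) input => output [param := input]:
    # replace every occurrence of the parameter in the output string
    # with the argument. Real lambda calculus must first α-convert
    # to avoid capturing free variables.
    return output.replace(param, argument)

print(beta_reduce("x", "xy", "z"))  # → zy
```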

Example

Going back to our example f(x) = 2x; g(x) = x³ and we want g(f(1)).

To avoid mixing up terms, we can rewrite:

f(x) = 2x and g(y) = y³

Then, we substitute f within g:

g(f(1)) = (f(1))³

=> g(f(1)) = (2*1)³

=> 2³ = 8.

Now with Lambda Calculus:

λx.2x

λx.x³

=> (λx.x³)((λx.2x)1)

First, apply α-conversion in order not to mix things up:

(λy.y³)((λx.2x)1)

Then, apply β-reduction on the innermost expression (λx.2x)1:

(λ param . output)input => output [param := input] => result

(λx.2x)1 => 2x [x := 1] => 2*1 = 2.

Then, apply β-reduction again on the resulting expression (λy.y³)2:

(λ param . output)input => output [param := input] => result

(λy.y³)2 => y³[y := 2] => 2³ => 8.

We got the same result! Brilliant, right?
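We can check the whole chain with Python lambdas (a sketch; the names `f` and `g` follow the example above):

```python
f = lambda x: 2 * x   # λx.2x
g = lambda y: y ** 3  # λy.y³ (after α-conversion)

# g(f(1)) mirrors (λy.y³)((λx.2x)1)
print(g(f(1)))  # → 8
```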


⚠️ If you're starting to feel confused at this point, please don't close the article! I understand it can be challenging at first, but I promise: when you sleep tonight, you will wake up with things clearer. So, keep reading and enjoy the rest of the article.
