Introduction

This is a step-by-step tutorial on await-generator for newcomers with only basic PHP knowledge.

await-generator plays tricks on a PHP feature called "generator". It allows you to write code more easily in a style called "asynchronous".

This tutorial involves concepts from PocketMine-MP, a server software for Minecraft written in PHP. The target audience is plugin developers for PocketMine-MP.

Generators

A PHP function that contains a yield keyword is called a "generator function".

function foo() {
	echo "hi!\n";
	yield;
	echo "working hard.\n";
	yield;
	echo "bye!\n";
}

When you call this function, it does not do anything (it doesn't even echo "hi"). Instead, you get a Generator object, which lets you control the execution of the function.

Let's tell PHP to start running this function:

$generator = foo();
echo "Let's start foo\n";
$generator->rewind();
echo "foo stopped\n";

You will get this output:

Let's start foo
hi!
foo stopped

The function stops when there is a yield statement. We can tell the function to continue running using the Generator object:

$generator->send(null);

This produces one more line of output:

working hard.

Now it stops again at the next yield.
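To run the rest of the function, a controller can keep resuming the generator until it reports that it has finished. The following is a minimal sketch using only built-in Generator methods. The fooLogged() variant and the $log array are illustrative names, not part of the original example; they record each step instead of echoing it so that the order of events is easy to inspect.

```php
<?php
// Same shape as foo() above, but each step is appended to $log
// (passed by reference) instead of being echoed.
function fooLogged(array &$log): Generator {
	$log[] = "hi!";
	yield;
	$log[] = "working hard.";
	yield;
	$log[] = "bye!";
}

$log = [];
$generator = fooLogged($log);
$generator->rewind(); // runs until the first yield
while($generator->valid()) { // valid() becomes false once the function has finished
	$log[] = "(paused)";
	$generator->send(null); // resume until the next yield or the end of the function
}

echo implode("\n", $log), "\n";
```

The log shows the function and its controller taking turns: "hi!", "(paused)", "working hard.", "(paused)", "bye!".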

Sending data into/out of the Generator

We can put a value after the yield keyword to send data to the controller:

function bar() {
	yield 1;
}
$generator = bar();
$generator->rewind();
var_dump($generator->current());

Output:

int(1)

Similarly, we can send data back to the function. If you use yield [value] as an expression, it evaluates to the value passed to $generator->send().

function bar() {
	$receive = yield;
	var_dump($receive);
}
$generator = bar();
$generator->rewind();
$generator->send(2);

Output:

int(2)

Furthermore, the function can eventually "return" a value. This return value is not handled the same way as a yield; it is obtained using $generator->getReturn() after the function has finished. Note that if you declare a return type on a generator function, it must be Generator (or a supertype such as Iterator, Traversable or iterable), no matter what value you return or whether you return at all:

function qux(): Generator {
	yield 1;
	return 2;
}
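As a short sketch of how the controller reads that return value, the snippet below drives qux() by hand. The variable names are illustrative. Note that getReturn() is only usable once the function has actually returned.

```php
<?php
function qux(): Generator {
	yield 1;
	return 2;
}

$generator = qux();
$generator->rewind();
$yielded = $generator->current(); // 1, the yielded value
$generator->send(null); // resume; the function now runs to its return statement
$finished = !$generator->valid(); // true: there is nothing left to run
$returned = $generator->getReturn(); // 2, the value after the return keyword

var_dump($yielded, $finished, $returned);
```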

Calling another generator

A generator can call another generator using the yield from syntax, which passes all the values yielded by the inner generator through to the controller and sends all the sent values back into it. The yield from expression resolves to the return value of the inner generator.

function test($value): Generator {
	$send = yield $value;
	return $send;
}

function main(): Generator {
	$a = yield from test(1);
	$b = yield from test(2);
	var_dump($a + $b);
}

$generator = main();
$generator->rewind();
var_dump($generator->current());
$generator->send(3);
var_dump($generator->current());
$generator->send(4);

Output:

int(1)
int(2)
int(7)

Hacking generators

Sometimes we want to make a generator function that does not yield at all. In that case, you can write 0 && yield; at the start of the function; this will make your function a generator function, but it will not yield anything. As of PHP 7.4.0, 0 && yield; is a no-op, which means it will not affect your program performance even if you run this line many times.

function emptyGenerator(): Generator {
	0 && yield;
	return 1;
}

$generator = emptyGenerator();
var_dump($generator->next());
var_dump($generator->getReturn());

Output:

NULL
int(1)

Asynchronous programming

Traditionally, when you call a function, it performs the required actions and returns after they're done. In asynchronous programming, the program logic may be executed after a function returns.

This leads to two problems. First, the function can't return any useful results to you, because the results are only available after the logic completes. Second, you may do something else on the assumption that the logic has completed, which leads to bugs. For example:

private $data;

function loadData($player) {
	// we will set $this->data[$player] some time later.
}

function main() {
	$this->loadData("SOFe");
	echo $this->data["SOFe"]; // fails: $this->data["SOFe"] has not been set yet
}

Here, loadData is the function that loads data asynchronously. main is implemented incorrectly, assuming that loadData is synchronous, i.e. it assumes that $this->data["SOFe"] is initialized.

Using callbacks

One of the simplest ways to solve this problem is to use callbacks. The caller can pass a closure to the async function, then the async function will run this closure when it has finished. An example function signature would be like this:

function loadData($player, Closure $callback) {
	// $callback will be called when player data have been loaded.
}

function main() {
	$this->loadData("SOFe", function() {
		echo $this->data["SOFe"]; // this is guaranteed to work now
	});
}

When exactly $callback is called depends on how loadData is implemented: it may be when a player sends a certain packet, when a scheduled task runs, or in some other scenario.
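To make the timing concrete, here is a minimal runnable sketch. FakeStore and its tick() method are hypothetical stand-ins for whatever eventually completes the load (a scheduler, a network response, and so on); they are not part of PocketMine-MP or await-generator.

```php
<?php
// Hypothetical stand-in for an async data store. Queued closures run
// "some time later", when tick() is called.
final class FakeStore {
	/** @var array<string, string> */
	public array $data = [];
	/** @var Closure[] */
	private array $pending = [];

	public function loadData(string $player, Closure $callback): void {
		$this->pending[] = function() use($player, $callback): void {
			$this->data[$player] = "data of $player"; // the load completes here
			$callback(); // only now is it safe to read the data
		};
	}

	public function tick(): void {
		while(($job = array_shift($this->pending)) !== null) {
			$job();
		}
	}
}

$store = new FakeStore();
$events = [];
$store->loadData("SOFe", function() use($store, &$events): void {
	$events[] = "callback: " . $store->data["SOFe"];
});
$events[] = "loadData returned"; // the callback has not run yet
$store->tick();

echo implode("\n", $events), "\n";
```

Here "loadData returned" is recorded before the callback entry, which shows that the function returns first and the callback runs later.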

More complex callbacks

(This section is deliberately complicated and hard to understand, because the purpose is to tell you that using callbacks is bad.)

What if we want to call multiple async functions one by one? In synchronous code, it would be simple:

$a = a();
$b = b($a);
$c = c($b);
$d = d($c);
var_dump($d);

In async code, we might need to do this (let's say a, b, c, d are async):

a(function($a) {
	b($a, function($b) {
		c($b, function($c) {
			d($c, function($d) {
				var_dump($d);
			});
		});
	});
});

It looks ugly, but it is readable enough. It would look more confusing if we needed to pass $a to d, though.

But what if we want to do if/else? In synchronous code, it looks like this:

$a = a();
if($a !== null) {
	$output = b($a);
} else {
	$output = c() + 1;
}

$d = very_complex_code($output);
$e = that_deals_with($output);
var_dump($d + $e + $a);

In async code, it is much more confusing:

a(function($a) {
	if($a !== null) {
		b($a, function($output) use($a) {
			$d = very_complex_code($output);
			$e = that_deals_with($output);
			var_dump($d + $e + $a);
		});
	} else {
		c(function($output) use($a) {
			$output = $output + 1;
			$d = very_complex_code($output);
			$e = that_deals_with($output);
			var_dump($d + $e + $a);
		});
	}
});

But we don't want to copy-paste the three lines of duplicated code. Maybe we can assign the whole closure to a variable:

a(function($a) {
	$closure = function($output) use($a) {
		$d = very_complex_code($output);
		$e = that_deals_with($output);
		var_dump($d + $e + $a);
	};

	if($a !== null) {
		b($a, $closure);
	} else {
		c(function($output) use($closure) {
			$closure($output + 1);
		});
	}
});

Oh no, this is getting out of control. Think about how complicated this would become when we want to use asynchronous functions in loops!
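As a rough illustration of the loop case, the sketch below sums the results of a callback-style function over three iterations. doubleAsync() and sumDoubled() are hypothetical names. Because a for loop cannot wait for a callback, each iteration has to be expressed as a recursive call made from inside the previous callback.

```php
<?php
// Hypothetical callback-style function: passes $x * 2 to $callback.
// It calls back immediately to keep the sketch runnable;
// a real async function would do so some time later.
function doubleAsync(int $x, Closure $callback): void {
	$callback($x * 2);
}

// Callback version of a loop that adds up doubleAsync(1), doubleAsync(2)
// and doubleAsync(3). The loop counter and running sum travel as parameters.
function sumDoubled(int $i, int $sum, Closure $done): void {
	if($i > 3) {
		$done($sum); // the "loop" has ended
		return;
	}
	doubleAsync($i, function(int $result) use($i, $sum, $done): void {
		sumDoubled($i + 1, $sum + $result, $done); // the "next iteration"
	});
}

$total = null;
sumDoubled(1, 0, function(int $sum) use(&$total): void {
	$total = $sum;
});
var_dump($total); // int(12), i.e. 2 + 4 + 6
```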

The await-generator library allows users to write async code in synchronous style. As you might have guessed, the yield keyword is a replacement for callbacks.

Using await-generator

await-generator provides an alternative approach to asynchronous programming. Functions that use async logic are written in generator functions. The main trick is that your function pauses (using yield) when you want to wait for a value, then await-generator resumes your function and sends you the return value from the async function via $generator->send().
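The pause-and-resume trick can be sketched with a toy driver. This is not the await-generator implementation or its API; drive(), step() and load() are hypothetical names, and the protocol used here (the generator yields a closure that accepts a callback) is an assumption made only for illustration.

```php
<?php
// Toy driver, NOT the await-generator implementation: the generator yields
// a closure that accepts a callback, and the driver resumes the generator
// with whatever value is passed to that callback.
function drive(Generator $generator): void {
	$generator->rewind(); // run until the first yield
	step($generator);
}

function step(Generator $generator): void {
	if(!$generator->valid()) {
		return; // the function has finished
	}
	$asyncCall = $generator->current();
	$asyncCall(function($result) use($generator): void {
		$generator->send($result); // resume the paused function with the result
		step($generator);
	});
}

// Hypothetical async operation that reports its result through a callback.
function load(string $name): Closure {
	return function(Closure $callback) use($name): void {
		$callback(strlen($name));
	};
}

$output = [];
drive((function() use(&$output): Generator {
	$a = yield load("SOFe"); // pauses here until the callback fires
	$b = yield load("PEMapModder");
	$output[] = $a + $b;
})());
var_dump($output);
```

The generator body reads like synchronous code, while all the callback plumbing lives in the driver. In this sketch $output ends up containing the single value 15 (4 + 11).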

Awaiting generators

Since every async function is implemented as a generator function, simply calling it will not have any effect. Instead, you have to yield from the generator.

function a(): Generator {
	// some other async logic here
	return 1;
}

function main(): Generator {
	$a = yield from $this->a();
	var_dump($a);
}

It is easy to forget to yield from the generator; if you do, the called function simply never runs.
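The difference can be observed with plain PHP generators. In this sketch the function names and the $log array are illustrative, and the generators are started with rewind() directly rather than through await-generator.

```php
<?php
$log = [];

function a(array &$log): Generator {
	$log[] = "a ran";
	0 && yield; // makes this a generator function without yielding anything
	return 1;
}

function forgetful(array &$log): Generator {
	$a = a($log); // BUG: $a is a Generator object, and a() never starts running
	$log[] = $a instanceof Generator ? "got a Generator" : "got a value";
	0 && yield;
}

function correct(array &$log): Generator {
	$a = yield from a($log); // runs a() and resolves to its return value
	$log[] = "got $a";
}

foreach([forgetful($log), correct($log)] as $generator) {
	$generator->rewind(); // nothing here yields, so each function runs to completion
}
echo implode("\n", $log), "\n";
```

The log contains "got a Generator", then "a ran", then "got 1": the body of a() runs only in the version that uses yield from.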

Handling errors

yield from will throw an exception if the generator function you called threw an exception.

function err(): Generator {
	// some other async logic here
	throw new Exception("Test");
}

function main(): Generator {
	try {
		yield from err();
	} catch(Exception $e) {
		var_dump($e->getMessage()); // string(4) "Test"
	}
}

Using callback-style from generators

Although it is easier to work with generator functions, ultimately, you will need to work with functions that do not use await-generator. In that case, callbacks are easier to use. A callback $resolve can be acquired using Await::promise.

function a(Closure $callback): void {
	// The other function that uses callbacks.
	// Let's assume this function will call $callback("foo") some time later.
}

function main(): Generator {
	return yield from Await::promise(fn($resolve) => a($resolve));
}

Some callback-style async functions may accept another callback for exception handling. This callback can be acquired by taking a second parameter $reject.

function a(Closure $callback, Closure $onError): void {
	// The other function that uses callbacks.
	// Let's assume this function will call $callback("foo") some time later,
	// or $onError with an exception if something goes wrong.
}

function main(): Generator {
	return yield from Await::promise(fn($resolve, $reject) => a($resolve, $reject));
}

Example

Let's say we want to make a function that sleeps for 20 server ticks, or throws an exception if the task is cancelled:

use pocketmine\scheduler\Task;

public function sleep(): Generator {
	yield from Await::promise(function($resolve, $reject) {
		$task = new class($resolve, $reject) extends Task {
			private $resolve;
			private $reject;
			public function __construct($resolve, $reject) {
				$this->resolve = $resolve;
				$this->reject = $reject;
			}
			public function onRun(int $tick) {
				($this->resolve)();
			}
			public function onCancel() {
				($this->reject)(new \Exception("Task cancelled"));
			}
		};
		$this->getServer()->getScheduler()->scheduleDelayedTask($task, 20);
	});
}

This is a bit complex indeed, but it gets handy once we have this function defined! Let's see what we can do with a countdown:

function countdown($player) {
	for($i = 10; $i > 0; $i--) {
		$player->sendMessage("$i seconds left");
		yield from $this->sleep();
	}

	$player->sendMessage("Time's up!");
}

This is much simpler than using ClosureTask in a loop!

Exposing a generator to normal API

Recall that generator functions do not do anything when they get called. Eventually, we have to call the generator function from a non-await-generator context. We can use the Await::g2c function for this:

private function generatorFunction(): Generator {
	// some async logic
}

Await::g2c($this->generatorFunction());

Sometimes we want to write the generator function as a closure and pass it directly:

Await::f2c(function(): Generator {
	// some async logic
});

You can also use Await::g2c/Await::f2c to schedule a separate async function in the background.

Running generators concurrently

In addition to calling multiple generators sequentially, you can also use Await::all() or Await::race() to run multiple generators.

If you have a JavaScript background, you can think of Generator objects as promises, and of Await::all() and Await::race() as the counterparts of Promise.all() and Promise.race().

Await::all()

Await::all() allows you to run an array of generators at the same time. If you yield from Await::all($array), your function resumes when all generators in $array have finished executing, and the expression resolves to an array of their return values under the same keys.

function loadData(string $name): Generator {
	// some async logic
	return strlen($name);
}

$array = [
	"SOFe" => $this->loadData("SOFe"), // don't yield it yet!
	"PEMapModder" => $this->loadData("PEMapModder"),
];
$results = yield from Await::all($array);
var_dump($results);

Output:

array(2) {
  ["SOFe"]=>
  int(4)
  ["PEMapModder"]=>
  int(11)
}

yield from Await::all() throws an exception as soon as any of the generators throws; it does not wait for the remaining generators to return.

Await::race()

Await::race() is like Await::all(), but it resumes as soon as any of the generators returns or throws. The value of the yield from expression is a 2-element array containing the key and the return value of that generator.

function sleep(int $time): Generator {
	// Let's say this is an await version of `scheduleDelayedTask`
	return $time;
}

function main(): Generator {
	[$k, $v] = yield from Await::race([
		"two" => $this->sleep(2),
		"one" => $this->sleep(1),
	]);
	var_dump($k); // string(3) "one"
	var_dump($v); // int(1)
}

Async iterators

A normal PHP function has only a single return value. To return data progressively, a generator would normally be used, and the caller iterates over the returned generator. However, if each step of the progressive data fetching involves async operations, the next() method needs to be async too. In other languages, this is called an "async generator" or "async iterator". However, since await-generator has hijacked the generator syntax, it is not possible to create such structures directly.

Instead, await-generator exposes the Traverser class, an extension to the normal await-generator syntax that provides an additional yield mode, Traverser::VALUE, which allows an async function to yield async iteration values. The value being traversed is passed as the key, with Traverser::VALUE as the yielded value. The resulting generator is wrapped in a Traverser object, which provides an asynchronous next() method that executes the generator asynchronously and provides the next traversed value.

Example

In normal PHP, we may have a line iterator on a file stream like this:

function lines(string $file) : Generator {
	$fh = fopen($file, "rt");
	try {
		while(($line = fgets($fh)) !== false) {
			yield $line;
		}
	} finally {
		fclose($fh);
	}
}

function count_empty_lines(string $file) {
	$count = 0;
	foreach(lines($file) as $line) {
		if(trim($line) === "") $count++;
	}
	return $count;
}

What if we have async versions of fopen, fgets and fclose and want to reimplement this lines function as async?

We would use the Traverser class instead:

function async_lines(string $file) : Generator {
	$fh = yield from async_fopen($file, "rt");
	try {
		while(true) {
			$line = yield from async_fgets($fh);
			if($line === false) {
				return;
			}
			yield $line => Traverser::VALUE;
		}
	} finally {
		yield from async_fclose($fh);
	}
}

function async_count_empty_lines(string $file) : Generator {
	$count = 0;

	$traverser = new Traverser(async_lines($file));
	while(yield from $traverser->next($line)) {
		if(trim($line) === "") $count++;
	}

	return $count;
}

Interrupting a generator

Yielding inside a finally block may cause a crash if the generator is not run to completion. If you perform async operations in a finally block, you must drain the traverser fully. If you don't want the iterator to continue executing, you may use yield $traverser->interrupt(), which keeps throwing its first parameter (SOFe\AwaitGenerator\InterruptException by default) into the async iterator until it stops executing. Beware that interrupt may throw an AwaitException if the underlying generator catches exceptions while yielding Traverser::VALUE (hence consuming the interrupts).

It is not necessary to interrupt the traverser if there are no finally blocks containing yield statements.

Versioning concerns

await-generator is guaranteed to be shade-compatible, backward-compatible and partly forward-compatible.

await-generator uses generator objects for communication. The values passed through generators (such as Await::ONCE) are constant strings that are guaranteed to remain unchanged within a major version. Therefore, multiple shaded versions of await-generator can be used together.

New constants may be added in minor versions. Older versions will crash when they receive constants from newer versions.

Only Await::f2c/Await::g2c loads await-generator code. Functions that merely yield values from the Await class will not affect the execution logic. Therefore, the version of await-generator on which Await::f2c/Await::g2c is called determines the highest version to use.

(For those who do not use the virion framework and are confused: for you, await-generator is versioned just like normal semver.)