Introduction
This is a step-by-step tutorial of await-generator for newcomers with only basic PHP knowledge.
await-generator plays tricks on a PHP feature called "generator". It allows you to write code more easily in a style called "asynchronous".
This tutorial involves concepts from PocketMine-MP, a server software for Minecraft written in PHP. The target audience is plugin developers for PocketMine-MP.
Generators
A PHP function that contains a yield keyword is called a "generator function".
function foo() {
echo "hi!\n";
yield;
echo "working hard.\n";
yield;
echo "bye!\n";
}
When you call this function, it does not do anything (it doesn't even echo "hi!"). Instead, you get a Generator object, which lets you control the execution of the function.
Let's tell PHP to start running this function:
$generator = foo();
echo "Let's start foo\n";
$generator->rewind();
echo "foo stopped\n";
You will get this output:
Let's start foo
hi!
foo stopped
The function stops when there is a yield statement.
We can tell the function to continue running using the Generator object:
$generator->send(null);
And we get this additional output:
working hard.
Now it stops again at the next yield.
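As a minimal sketch of the stepping described above, the loop below drives a generator until the function finishes. It only uses methods of PHP's built-in Generator class; the names steps and $pauses are illustrative and not part of the original example.

```php
function steps(): Generator {
    echo "hi!\n";
    yield;
    echo "working hard.\n";
    yield;
    echo "bye!\n";
}

$generator = steps();
$pauses = 0;
$generator->rewind();          // runs the function until the first yield
while($generator->valid()) {   // valid() is true while the function is paused at a yield
    $pauses++;
    $generator->send(null);    // resume until the next yield, or until the function ends
}
echo "paused $pauses times\n"; // paused 2 times
```

Once valid() returns false, the function has run to its end and cannot be resumed again.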
Sending data into/out of the Generator
We can put a value after the yield keyword to send data to the controller:
function bar() {
yield 1;
}
$generator = bar();
$generator->rewind();
var_dump($generator->current());
int(1)
Similarly, we can send data back to the function. If you use yield [value] as an expression, it evaluates to the value passed to $generator->send().
function bar() {
$receive = yield;
var_dump($receive);
}
$generator = bar();
$generator->rewind();
$generator->send(2);
int(2)
Furthermore, the function can eventually "return" a value. This return value is not handled the same way as a yield; it is obtained using $generator->getReturn(). However, if you declare a return type, it must be Generator (or one of its supertypes, such as iterable), no matter what you return or whether you return at all:
function qux(): Generator {
yield 1;
return 2;
}
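To make the difference between yielded values and the return value concrete, here is a small sketch that drives a function like the one above. The name quxDemo is only for illustration. Note that getReturn() throws if it is called before the function has returned.

```php
function quxDemo(): Generator {
    yield 1;
    return 2;
}

$generator = quxDemo();
$generator->rewind();
var_dump($generator->current());   // int(1), the yielded value
$generator->next();                // resume; the function runs to its return statement
var_dump($generator->valid());     // bool(false), the function has finished
var_dump($generator->getReturn()); // int(2), the returned value
```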
Calling another generator
You can call another generator in a generator using the yield from syntax, which passes through all the yielded values and sends back all the sent values. The yield from expression resolves to the return value of the generator.
function test($value): Generator {
$send = yield $value;
return $send;
}
function main(): Generator {
$a = yield from test(1);
$b = yield from test(2);
var_dump($a + $b);
}
$generator = main();
$generator->rewind();
var_dump($generator->current());
$generator->send(3);
var_dump($generator->current());
$generator->send(4);
int(1)
int(2)
int(7)
Hacking generators
Sometimes we want to make a generator function that does not yield at all. In that case, you can write 0 && yield; at the start of the function; this will make your function a generator function, but it will not yield anything. As of PHP 7.4.0, 0 && yield; is a no-op, which means it will not affect your program's performance even if you run this line many times.
function emptyGenerator(): Generator {
0 && yield;
return 1;
}
$generator = emptyGenerator();
var_dump($generator->next());
var_dump($generator->getReturn());
NULL
int(1)
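One plausible reason to want such a function is that callers can then treat it like any other generator function. The sketch below (the names alreadyCached and caller are hypothetical) shows a caller using yield from on a generator that never actually yields:

```php
function alreadyCached(): Generator {
    0 && yield; // never executed, but makes this a generator function
    return 42;
}

function caller(): Generator {
    // Works exactly like yield from on a generator that really pauses.
    $value = yield from alreadyCached();
    return $value + 1;
}

$generator = caller();
$generator->rewind();
var_dump($generator->valid());     // bool(false), nothing ever paused
var_dump($generator->getReturn()); // int(43)
```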
Asynchronous programming
Traditionally, when you call a function, it performs the required actions and returns after they're done. In asynchronous programming, the program logic may be executed after a function returns.
This leads to two problems. First, the function can't return any useful results to you, because the results are only available after the logic completes. Second, you may do something else assuming the logic has completed, which leads to a bug. For example:
private $data;
function loadData($player) {
    // we will set $this->data[$player] some time later.
}
function main() {
    $this->loadData("SOFe");
    echo $this->data["SOFe"]; // fails: the "SOFe" key has not been set yet
}
Here, loadData is the function that loads data asynchronously. main is implemented incorrectly: it assumes that loadData is synchronous, i.e. that $this->data["SOFe"] is already initialized when loadData returns.
Using callbacks
One of the simplest ways to solve this problem is to use callbacks. The caller can pass a closure to the async function, then the async function will run this closure when it has finished. An example function signature would be like this:
function loadData($player, Closure $callback) {
    // $callback will be called when player data have been loaded.
}
function main() {
    $this->loadData("SOFe", function() {
        echo $this->data["SOFe"]; // this is guaranteed to work now
    });
}
When the $callback gets called depends on the implementation of loadData. This may be when a player sends a certain packet, when a scheduled task gets run, or in other scenarios.
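The following self-contained sketch simulates this ordering without a server. ToyPlugin and its tick() method are hypothetical stand-ins: tick() plays the role of "a scheduled task gets run", and the loaded value is just the length of the player name.

```php
final class ToyPlugin {
    /** @var array<string, int> */
    private array $data = [];
    /** @var Closure[] jobs waiting for the simulated load to finish */
    private array $pending = [];

    public function loadData(string $player, Closure $callback): void {
        // Nothing is loaded yet; remember what to do once the data arrive.
        $this->pending[] = function() use($player, $callback): void {
            $this->data[$player] = strlen($player);
            $callback();
        };
    }

    /** Hypothetical stand-in for the moment the async logic completes. */
    public function tick(): void {
        $jobs = $this->pending;
        $this->pending = [];
        foreach($jobs as $job) {
            $job();
        }
    }

    public function main(array &$output): void {
        $this->loadData("SOFe", function() use(&$output): void {
            $output[] = $this->data["SOFe"]; // safe: runs only after the data are set
        });
        $output[] = "main returned";
    }
}

$plugin = new ToyPlugin();
$output = [];
$plugin->main($output);
$plugin->tick();
var_dump($output); // "main returned" comes first, then int(4)
```

The callback runs after main has already returned, which is exactly why main cannot read the data directly.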
More complex callbacks
(This section is deliberately complicated and hard to understand, because the purpose is to tell you that using callbacks is bad.)
What if we want to call multiple async functions one by one? In synchronous code, it would be simple:
$a = a();
$b = b($a);
$c = c($b);
$d = d($c);
var_dump($d);
In async code, we might need to do this (let's say a, b, c, d are async):
a(function($a) {
    b($a, function($b) {
        c($b, function($c) {
            d($c, function($d) {
                var_dump($d);
            });
        });
    });
});
Looks ugly, but readable enough. It might look more confusing if we need to pass $a to d though.
But what if we want to do if/else? In synchronous code, it looks like this:
$a = a();
if($a !== null) {
    $output = b($a);
} else {
    $output = c() + 1;
}
$d = very_complex_code($output);
$e = that_deals_with($output);
var_dump($d + $e + $a);
In async code, it is much more confusing:
a(function($a) {
    if($a !== null) {
        b($a, function($output) use($a) {
            $d = very_complex_code($output);
            $e = that_deals_with($output);
            var_dump($d + $e + $a);
        });
    } else {
        c(function($output) use($a) {
            $output = $output + 1;
            $d = very_complex_code($output);
            $e = that_deals_with($output);
            var_dump($d + $e + $a);
        });
    }
});
But we don't want to copy-paste the three lines of duplicated code. Maybe we can assign the whole closure to a variable:
a(function($a) {
    $closure = function($output) use($a) {
        $d = very_complex_code($output);
        $e = that_deals_with($output);
        var_dump($d + $e + $a);
    };
    if($a !== null) {
        b($a, $closure);
    } else {
        c(function($output) use($closure) {
            $closure($output + 1);
        });
    }
});
Oh no, this is getting out of control. Think about how complicated this would become when we want to use asynchronous functions in loops!
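To give a feel for the loop case, here is a rough sketch of a countdown written purely with callbacks. ToyTimer is a hypothetical async function that only "finishes" when flush() is called; it stands in for whatever real scheduler you would use. Because a for loop cannot wait for a callback, every iteration has to become a recursive closure call.

```php
/** Hypothetical async timer: callbacks run only when flush() is called later. */
final class ToyTimer {
    /** @var Closure[] */
    private static array $waiting = [];

    public static function sleep(Closure $callback): void {
        self::$waiting[] = $callback;
    }

    public static function flush(): void {
        while(($next = array_shift(self::$waiting)) !== null) {
            $next();
        }
    }
}

function countdownWithCallbacks(int $from, array &$messages, Closure $done): void {
    // The loop body is a closure that schedules itself again after each sleep.
    $step = function(int $i) use(&$step, &$messages, $done): void {
        if($i <= 0) {
            $messages[] = "Time's up!";
            $done();
            return;
        }
        $messages[] = "$i seconds left";
        ToyTimer::sleep(function() use(&$step, $i): void {
            $step($i - 1);
        });
    };
    $step($from);
}

$messages = [];
$finished = false;
countdownWithCallbacks(3, $messages, function() use(&$finished): void {
    $finished = true;
});
ToyTimer::flush(); // simulate time passing
var_dump($messages);
```

Even this tiny loop needs a self-referencing closure and an explicit completion callback.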
The await-generator library allows users to write async code in synchronous style.
As you might have guessed, the yield keyword is a replacement for callbacks.
Using await-generator
await-generator provides an alternative approach to asynchronous programming.
Functions that use async logic are written as generator functions. The main trick is that your function pauses (using yield) when you want to wait for a value, then await-generator resumes your function and sends you the return value from the async function via $generator->send().
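To illustrate the pause-and-resume idea only, here is a toy runner written in plain PHP. It is not await-generator's actual implementation or protocol: in this sketch a generator yields a closure that accepts a resume callback, and the hypothetical names ToyLoop, toyRun, toyLoad and toyMain are made up for the example.

```php
/** Hypothetical job queue standing in for a scheduler. */
final class ToyLoop {
    /** @var Closure[] */
    public static array $queue = [];

    public static function defer(Closure $job): void {
        self::$queue[] = $job;
    }

    public static function run(): void {
        while(($job = array_shift(self::$queue)) !== null) {
            $job();
        }
    }
}

/** Toy runner: starts the yielded operation and resumes the generator with its result. */
function toyRun(Generator $gen): void {
    if(!$gen->valid()) {
        return; // the function has finished
    }
    $operation = $gen->current();
    $operation(function($result) use($gen): void {
        $gen->send($result); // the paused yield expression evaluates to $result
        toyRun($gen);        // handle the next yield, if any
    });
}

/** Toy async operation: reports the length of $name some time later. */
function toyLoad(string $name): Closure {
    return fn(Closure $resume) => ToyLoop::defer(fn() => $resume(strlen($name)));
}

function toyMain(array &$log): Generator {
    $a = yield toyLoad("SOFe");        // pauses here until the result arrives
    $log[] = $a;
    $b = yield toyLoad("PEMapModder"); // pauses again
    $log[] = $a + $b;
}

$log = [];
toyRun(toyMain($log));
ToyLoop::run();  // simulate the deferred work completing
var_dump($log);  // int(4), then int(15)
```

The body of toyMain reads like synchronous code even though each result only arrives after the deferred job runs.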
Awaiting generators
Since every async function is implemented as a generator function, simply calling it will not have any effect. Instead, you have to yield from the generator.
function a(): Generator {
// some other async logic here
return 1;
}
function main(): Generator {
$a = yield from $this->a();
var_dump($a);
}
It is easy to forget to yield from the generator, in which case the called function silently does nothing.
Handling errors
yield from will throw an exception if the generator function you called threw an exception.
function err(): Generator {
    // some other async logic here
    throw new Exception("Test");
}
function main(): Generator {
    try {
        yield from err();
    } catch(Exception $e) {
        var_dump($e->getMessage()); // string(4) "Test"
    }
}
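The propagation itself is plain PHP behaviour and can be tried without await-generator. In this sketch (failing and outer are illustrative names), the generators are driven by hand, and the inner generator throws only after it has been resumed once:

```php
function failing(): Generator {
    yield "about to fail";
    throw new RuntimeException("Test");
}

function outer(): Generator {
    try {
        yield from failing();
    } catch(RuntimeException $e) {
        // The exception thrown inside failing() surfaces at the yield from.
        return "caught: " . $e->getMessage();
    }
    return "no error";
}

$generator = outer();
$generator->rewind();    // pauses at the yield inside failing()
$generator->send(null);  // resumes failing(), which throws
var_dump($generator->getReturn()); // string(12) "caught: Test"
```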
Using callback-style from generators
Although it is easier to work with generator functions, ultimately, you will need to work with functions that do not use await-generator. In that case, callbacks are easier to use. A callback $resolve can be acquired using Await::promise.
function a(Closure $callback): void {
// The other function that uses callbacks.
// Let's assume this function will call $callback("foo") some time later.
}
function main(): Generator {
return yield from Await::promise(fn($resolve) => a($resolve));
}
Some callback-style async functions may accept another callback for exception handling. This callback can be acquired by taking a second parameter $reject.
function a(Closure $callback, Closure $onError): void {
// The other function that uses callbacks.
// Let's assume this function will call $callback("foo") some time later.
}
function main(): Generator {
return yield from Await::promise(fn($resolve, $reject) => a($resolve, $reject));
}
Example
Let's say we want to make a function that sleeps for 20 server ticks, or throws an exception if the task is cancelled:
use pocketmine\scheduler\Task;
public function sleep(): Generator {
    yield from Await::promise(function($resolve, $reject) {
        $task = new class($resolve, $reject) extends Task {
            private $resolve;
            private $reject;
            public function __construct($resolve, $reject) {
                $this->resolve = $resolve;
                $this->reject = $reject;
            }
            public function onRun(int $tick) {
                ($this->resolve)();
            }
            public function onCancel() {
                ($this->reject)(new \Exception("Task cancelled"));
            }
        };
        $this->getServer()->getScheduler()->scheduleDelayedTask($task, 20);
    });
}
This is a bit complex indeed, but it comes in handy once we have this function defined! Let's see what we can do with a countdown:
function countdown($player) {
    for($i = 10; $i > 0; $i--) {
        $player->sendMessage("$i seconds left");
        yield from $this->sleep();
    }
    $player->sendMessage("Time's up!");
}
This is much simpler than using ClosureTask in a loop!
Exposing a generator to normal API
Recall that generator functions do not do anything when they get called.
Eventually, we have to call the generator function from a non-await-generator context.
We can use the Await::g2c function for this:
private function generatorFunction(): Generator {
// some async logic
}
Await::g2c($this->generatorFunction());
Sometimes we want to write the generator function as a closure and pass it directly:
Await::f2c(function(): Generator {
// some async logic
});
You can also use Await::g2c/Await::f2c to schedule a separate async function in the background.
Running generators concurrently
In addition to calling multiple generators sequentially, you can also use Await::all() or Await::race() to run multiple generators concurrently. If you have a JavaScript background, you can think of Generator objects as promises, and of Await::all() and Await::race() as Promise.all() and Promise.race().
Await::all()
Await::all() allows you to run an array of generators at the same time. If you yield from Await::all($array), your function resumes when all generators in $array have finished executing.
function loadData(string $name): Generator {
// some async logic
return strlen($name);
}
$array = [
"SOFe" => $this->loadData("SOFe"), // don't yield it yet!
"PEMapModder" => $this->loadData("PEMapModder"),
];
$results = yield from Await::all($array);
var_dump($results);
Output:
array(2) {
["SOFe"]=>
int(4)
["PEMapModder"]=>
int(11)
}
Yielding from Await::all() will throw an exception as soon as any of the generators throws. The error does not wait until all the other generators return.
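The "resume when everything has finished" behaviour can be pictured as a simple counter over callbacks. The sketch below is a plain-PHP illustration of that idea and not how Await::all() is implemented; toyAll, $resolvers and the manually triggered operations are all hypothetical, and error handling is left out.

```php
/**
 * Toy join: starts every operation and calls $onAllDone once all have resolved.
 * Each operation is a closure that receives a $resolve callback.
 */
function toyAll(array $operations, Closure $onAllDone): void {
    $results = [];
    $remaining = count($operations);
    if($remaining === 0) {
        $onAllDone([]);
        return;
    }
    foreach($operations as $key => $operation) {
        $operation(function($value) use($key, &$results, &$remaining, $onAllDone): void {
            $results[$key] = $value; // results keep the keys of the input array
            if(--$remaining === 0) {
                $onAllDone($results);
            }
        });
    }
}

// Operations that we resolve by hand to control the order of completion.
$resolvers = [];
$operations = [
    "SOFe" => function(Closure $resolve) use(&$resolvers): void {
        $resolvers["SOFe"] = $resolve;
    },
    "PEMapModder" => function(Closure $resolve) use(&$resolvers): void {
        $resolvers["PEMapModder"] = $resolve;
    },
];

$final = null;
toyAll($operations, function(array $results) use(&$final): void {
    $final = $results;
});

$resolvers["PEMapModder"](11);    // finishes first
$afterFirst = $final;             // still null: "SOFe" has not finished
$resolvers["SOFe"](4);
var_dump($afterFirst);            // NULL
var_dump($final["SOFe"], $final["PEMapModder"]); // int(4) int(11)
```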
Await::race()
Await::race() is like Await::all(), but it resumes as soon as any of the generators returns or throws. The value returned by yield from is a 2-element array containing the key and the value.
function sleep(int $time): Generator {
    // Let's say this is an await version of `scheduleDelayedTask`
    return $time;
}
function main(): Generator {
    [$k, $v] = yield from Await::race([
        "two" => $this->sleep(2),
        "one" => $this->sleep(1),
    ]);
    var_dump($k); // string(3) "one"
    var_dump($v); // int(1)
}
Async iterators
In normal PHP functions, there is only a single return value. If we want to return data progressively, we would normally use a generator, which the user can iterate over. However, if the user intends to perform async operations in every step of progressive data fetching, the next() method needs to be async too. In other languages, this is called "async generator" or "async iterator". However, since await-generator has hijacked the generator syntax, it is not possible to create such structures directly.
Instead, await-generator exposes the Traverser class, which is an extension to the normal await-generator syntax, providing an additional yield mode Traverser::VALUE, which allows an async function to yield async iteration values. A key (the current traversed value) is passed with Traverser::VALUE. The resultant generator is wrapped with the Traverser class, which provides an asynchronous next() method that executes the generator asynchronously and returns the next traversed value.
Example
In normal PHP, we may have a line iterator on a file stream like this:
function lines(string $file) : Generator {
    $fh = fopen($file, "rt");
    try {
        while(($line = fgets($fh)) !== false) {
            yield $line;
        }
    } finally {
        fclose($fh);
    }
}
function count_empty_lines(string $file) {
    $count = 0;
    foreach(lines($file) as $line) {
        if(trim($line) === "") $count++;
    }
    return $count;
}
What if we have async versions of fopen, fgets and fclose and want to reimplement this lines function as async? We would use the Traverser class instead:
function async_lines(string $file) : Generator {
    $fh = yield from async_fopen($file, "rt");
    try {
        while(true) {
            $line = yield from async_fgets($fh);
            if($line === false) {
                return;
            }
            yield $line => Traverser::VALUE;
        }
    } finally {
        yield from async_fclose($fh);
    }
}
function async_count_empty_lines(string $file) : Generator {
    $count = 0;
    $traverser = new Traverser(async_lines($file));
    while(yield from $traverser->next($line)) {
        if(trim($line) === "") $count++;
    }
    return $count;
}
Interrupting a generator
Yielding inside finally may cause a crash if the generator is not yielded fully. If you perform async operations in the finally block, you must drain the traverser fully.
If you don't want the iterator to continue executing, you may use yield from $traverser->interrupt(), which keeps throwing the first parameter (SOFe\AwaitGenerator\InterruptException by default) into the async iterator until it stops executing.
Beware that interrupt may throw an AwaitException if the underlying generator catches exceptions during yield Traverser::VALUE statements (hence consuming the interrupts). It is not necessary to interrupt the traverser if there are no finally blocks containing yield statements.
Versioning concerns
await-generator is guaranteed to be shade-compatible, backward-compatible and partly forward-compatible.
await-generator uses generator objects for communication. The values passed through generators (such as Await::ONCE) are constant strings that are guaranteed to remain unchanged within a major version. Therefore, multiple shaded versions of await-generator can be used together.
New constants may be added over minor versions. Older versions will crash when they receive constants from newer versions.
Only Await::f2c/Await::g2c loads await-generator code. Functions that merely yield values from the Await class will not affect the execution logic. Therefore, the version of await-generator on which Await::f2c/Await::g2c is called determines the highest version to use.
(For those who do not use the virion framework and are confused: await-generator is versioned just like normal semver for you.)