In favor of RFC "Scalar Type Hints"
February 09, 2015 (this post is also available in French).
The Scalar Type Hints RFC for PHP 7 was first proposed in December 2014. Version 0.2 followed in mid-January 2015, after several major ideas were changed, and the RFC is now at version 0.3, which integrates return types, since the Return Type Declarations RFC was accepted a few days ago.
Warning: this post is about a feature that is currently (beginning of February 2015) being discussed for PHP 7. As for any proposal being discussed, this feature might get accepted, or an alternative version could be, or it could even be rejected – be it for PHP 7.0 or for a later version. The truth about this feature – or its absence – will be in the PHP manual and in migration guides.
I’ve been following this RFC (and the previous ones) with some interest and, as I took some time to play with it a bit last week, building PHP from the sources of the corresponding Git branch, I’ll try to summarize here why I think it is interesting. Please note this is my personal opinion.
Why (scalar) type-hints?
PHP has supported type-hints[1] on complex types for quite some time already: since PHP 5.0 for objects, 5.1 for array and 5.4 for callable. The question of extending this support to scalar types (integer, float, boolean and string) has been discussed several times in recent years, each time with a bit more support.
For me, type-hints, scalar or not, can and/or must fulfill several objectives:
- Make code more explicit: type-hints let developers know which type of data a function expects just by reading its prototype[2], which my IDE generally displays when I’m writing a function call, without having to read either its source code or its documentation.
- Ensure that, in the function itself, the data we have at our disposal is of the expected type – without having to write the corresponding checks ourselves.
- Be compatible with PHP’s flexible philosophy and history.
- Not break compatibility of libraries that specify type-hints with the code calling them, code which might not necessarily work with strict typing.
With the proposition this RFC introduces, if it passes, a function expecting an integer could be declared this way:
function my_function_01(int $myint)
{
var_dump($myint);
}
If this function is called with something other than an integer as a parameter, the call will fail and a catchable error will be raised, as for the other kinds of already existing type-hints:
Catchable fatal error: Argument 1 passed to my_function_01() must be of the type integer, string given
Seeing this, this type-hinting idea for scalars doesn’t seem too hard to work with.
Some weak typing…
One of PHP’s great strengths, and one of the key elements of its accessibility, is its weak typing principle, where type conversions are done when necessary.
Typically, with a URL like http://example.org/index.php?id=123, the $_GET['id'] element will contain the string "123" – which can be manipulated with pretty much all PHP functions as if it were an integer, without us having to worry about its real type: what matters is the data, the value it contains and the meaning we give it! This example also applies well to most database querying APIs, where results are often returned as strings.
This RFC has, in my eyes, the great advantage of going with weakly typed type-hints by default. This means that, for the function presented a bit earlier, both of the following calls would be valid:
my_function_01(42);
my_function_01('42');
In both cases, the $myint variable received by the function will be an integer: using the specified type-hint, PHP will do an automatic conversion of the given value, following the same rules in place everywhere else in the language!
This means the output will look like this:
int(42)
int(42)
Still, passing a value that cannot be converted to an integer will cause the error shown earlier to be raised. For instance, with:
my_function_01('plop');
We’ll get, once again, the following output:
Catchable fatal error: Argument 1 passed to my_function_01() must be of the type integer, string given
So, in a few words, this RFC introduces a flexible mechanism for type-hinting that answers both needs:
- of the developers of the function, who want to receive some data of the type they specified,
- and of the developers who use this function, who want to keep using PHP’s weak typing approach.
This default flexible behavior also has the advantage of integrating well with PHP’s traditional approach and not breaking compatibility of existing code.
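The weak-mode behavior described above can be sketched as follows. This is only an illustration under assumptions: expect_int() and try_expect_int() are hypothetical names made up for this sketch, and it assumes a released PHP 7 or later, where a failed scalar type check surfaces as a TypeError exception rather than as the catchable fatal error produced by the RFC branch I played with.

```php
<?php
// No declare(strict_types=1) in this file => calls are made in weak mode.

// Hypothetical function, named for this sketch only.
function expect_int(int $myint): string
{
    // Describe what the function actually received.
    return gettype($myint) . '(' . $myint . ')';
}

// Hypothetical wrapper: returns the description, or 'TypeError' when the
// value cannot be converted to an integer.
function try_expect_int($value): string
{
    try {
        return expect_int($value);
    } catch (TypeError $e) {
        // Assumption: released PHP 7+ throws a TypeError here.
        return 'TypeError';
    }
}

echo try_expect_int(42), "\n";     // integer(42)
echo try_expect_int('42'), "\n";   // integer(42): the string was converted
echo try_expect_int('plop'), "\n"; // TypeError: no conversion possible
```

The function body only ever sees an integer; the conversion, or the failure, happens at the call boundary.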
And some strict typing!
Still, even if PHP has a history and tradition of being flexible, many developers have tended, for a few years now, to work with a stricter approach to typing, using variables of the integer type when it comes to storing integer values, variables of the float type when it’s about storing decimal values, and so on.
At a certain level, this tendency is shown by the almost systematic use of strict comparison operators like === or !==. At another level, it is also found in the appreciation for the type annotations feature of the Hack language.
Well, as it is, this RFC also answers this need for strict typing!
To allow switching to strict scalar type-hinting mode, a new declare()[3] directive is added: strict_types.
Inside a declare(strict_types=1) block, all function calls with scalar type-hints will be done in strict type mode. For example:
declare(strict_types=1) {
my_function_01('42');
}
The my_function_01() function is defined as expecting an integer as a parameter. As we have given it a string after activating the strict typing mode, this call will cause an error to be raised:
Catchable fatal error: Argument 1 passed to my_function_01() must be of the type integer, string given,
called in .../test-01.php on line 9
and defined in .../test-01.php on line 17
Instead of wrapping portions of code in declare(strict_types=1) blocks, using this directive at the beginning of a file is also possible – in which case all function calls made from this file will be in strict typing mode:
<?php
declare(strict_types=1);
// All function/method calls made from this
// file will be using strict typing
// Valid => int(42)
my_function_01(42);
// Invalid
my_function_01('42');
?>
With this directive, this RFC answers both the needs and wishes of those who prefer working with strict types, even if going against the flexible traditional usages of PHP, and of those who prefer working with weak types. In each function, as shown earlier, the types of data received as parameters are those defined and expected by the function’s authors.
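Here is a minimal sketch of the file-level form, again under assumptions: expect_int_strict() and call_from_strict_file() are hypothetical names, and the sketch targets a released PHP 7 or later, which reports the failure as a TypeError exception and only accepts the file-level declare(strict_types=1); form (not the block form shown earlier, which comes from the RFC draft).

```php
<?php
declare(strict_types=1); // has to be the very first statement of the file

// Hypothetical function, named for this sketch only.
function expect_int_strict(int $myint): int
{
    return $myint;
}

// Hypothetical wrapper: the call inside is made from a strict file,
// so no conversion is attempted on the argument.
function call_from_strict_file($value): string
{
    try {
        return var_export(expect_int_strict($value), true);
    } catch (TypeError $e) {
        // Assumption: released PHP 7+ throws a TypeError here.
        return 'TypeError';
    }
}

echo call_from_strict_file(42), "\n";   // 42
echo call_from_strict_file('42'), "\n"; // TypeError: a string is not an int
```

The declaration of expect_int_strict() is the same as it would be in weak mode; only the calling file changed.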
A more substantial example?
To visualize things a bit better, let’s consider the following class, which would be part of a library:
// We may or may not use a declare(strict_types=1) here, for
// function calls made from this file.
class MyLib
{
public function expectInt(int $myint) {
printf("%s( %s )\n\n", __METHOD__, str_replace("\n", " ", var_export($myint, true)));
}
public function expectFloat(float $myfloat) {
printf("%s( %s )\n\n", __METHOD__, str_replace("\n", " ", var_export($myfloat, true)));
}
public function expectString(string $mystring) {
printf("%s( %s )\n\n", __METHOD__, str_replace("\n", " ", var_export($mystring, true)));
}
}
We’ll also set up an error handler, tasked with intercepting (amongst others) the catchable errors that will be raised if type-hints are not respected:
set_error_handler(function (int $errno , string $errstr, string $errfile, int $errline, array $errcontext) {
printf("Error:\n");
printf(" * errno = %s\n", var_export($errno, true));
printf(" * errstr = %s\n", var_export($errstr, true));
printf(" * errfile = %s\n", var_export($errfile, true));
printf(" * errline = %s\n", var_export($errline, true));
printf(" * errcontext = %s\n", str_replace("\n", " ", var_export($errcontext, true)));
return true; // continue working
});
We’ll now use this class from a file in weak typing mode (so, not using the declare() directive):
// No declare() on top of this file => weak mode
$obj = new MyLib();
// Working calls
printf("Working calls:\n\n");
$obj->expectInt(123456);
$obj->expectFloat(3.1415);
$obj->expectString('Hello, World!');
$obj->expectString('123456789');
// Those calls will also work, as we are not in strict_types mode
// => The usual conversions will apply ("weak" typing)
printf("Other working calls:\n\n");
$obj->expectInt('123456'); // MyLib::expectInt( 123456 )
$obj->expectFloat('3.1415'); // MyLib::expectFloat( 3.1415000000000002 )
$obj->expectString(123456789); // MyLib::expectString( '123456789' )
// Calls that don't work, as the usual type conversions are not possible
printf("Calls that don't work:\n\n");
$obj->expectInt('abcdef');
$obj->expectFloat([123, 'hello']);
$obj->expectString($obj);
Executing this example will generate the following output (reformatted a bit, to facilitate reading):
Working calls:
MyLib::expectInt( 123456 )
MyLib::expectFloat( 3.1415000000000002 )
MyLib::expectString( 'Hello, World!' )
MyLib::expectString( '123456789' )
Other working calls:
MyLib::expectInt( 123456 )
MyLib::expectFloat( 3.1415000000000002 )
MyLib::expectString( '123456789' )
Calls that don't work:
Error:
* errno = 4096
* errstr = 'Argument 1 passed to MyLib::expectInt() must be of the type integer, string given, called in .../test-02-nostrict.php on line 34 and defined'
* errfile = '.../lib.php'
* errline = 6
* errcontext = array ( )
MyLib::expectInt( 'abcdef' )
Error:
* errno = 4096
* errstr = 'Argument 1 passed to MyLib::expectFloat() must be of the type float, array given, called in .../test-02-nostrict.php on line 35 and defined'
* errfile = '.../lib.php'
* errline = 10
* errcontext = array ( )
MyLib::expectFloat( array ( 0 => 123, 1 => 'hello', ) )
Error:
* errno = 4096
* errstr = 'Argument 1 passed to MyLib::expectString() must be of the type string, object given, called in .../test-02-nostrict.php on line 36 and defined'
* errfile = '.../lib.php'
* errline = 14
* errcontext = array ( )
MyLib::expectString( MyLib::__set_state(array( )) )
This clearly shows that in weak typing mode, when automatic type conversions were possible, PHP used them. On the other hand, when no type conversion was possible, errors were raised, preventing (if my error handler hadn’t naively returned true) the execution of functions called with invalid parameters.
Now, let’s see some function calls from a file in strict typing mode:
declare(strict_types=1); // strict mode !
$obj = new MyLib();
// Working calls
printf("Working calls:\n\n");
$obj->expectInt(123456);
$obj->expectFloat(3.1415);
$obj->expectString('Hello, World!');
$obj->expectString('123456789');
// Calls that won't work, as we are here in strict_types mode
printf("Calls that don't work:\n\n");
$obj->expectInt('abcdef');
$obj->expectFloat([123, 'hello']);
$obj->expectString(123456789);
The output we’ll get, this time, will look like this (once again, modified a bit to facilitate reading):
Working calls:
MyLib::expectInt( 123456 )
MyLib::expectFloat( 3.1415000000000002 )
MyLib::expectString( 'Hello, World!' )
MyLib::expectString( '123456789' )
Calls that don't work:
Error:
* errno = 4096
* errstr = 'Argument 1 passed to MyLib::expectInt() must be of the type integer, string given, called in .../test-02-nostrict.php on line 34 and defined'
* errfile = '.../lib.php'
* errline = 6
* errcontext = array ( )
MyLib::expectInt( 'abcdef' )
Error:
* errno = 4096
* errstr = 'Argument 1 passed to MyLib::expectFloat() must be of the type float, array given, called in .../test-02-nostrict.php on line 35 and defined'
* errfile = '.../lib.php'
* errline = 10
* errcontext = array ( )
MyLib::expectFloat( array ( 0 => 123, 1 => 'hello', ) )
Error:
* errno = 4096
* errstr = 'Argument 1 passed to MyLib::expectString() must be of the type string, integer given, called in .../test-02-nostrict.php on line 36 and defined'
* errfile = '.../lib.php'
* errline = 14
* errcontext = array ( )
MyLib::expectString( 123456789 )
This time, the only calls that have worked are those where data types, and not only their values, did match the type-hints specified when declaring the methods: no type conversion has automatically been done!
Strict typing… by calling file? Or by function called?
An idea has been discussed several times: it could be up to each function to decide if, when it’s called, types must be checked in weak mode (automatic conversions can be applied) or in strict mode (an error must be raised when a type is incorrect, even if the value is acceptable).
With such a proposition, it’s the authors of a library who would decide if it can be used by an application developed with a weak types logic or if it may only be used by applications written with a strict approach to typing.
In other words, if tomorrow the authors of one of the numerous libraries I use in my projects were to start using type-hints, they could decide that my application, in which I sometimes have integer values in variables of string type, is not worth using their library – and that errors should be raised here and there!
I don’t understand how this could be a good idea – or even a viable idea in any way, for that matter!
Type-hints in a function’s declaration are there to:
- indicate to the callers what kind of values it expects,
- and to guarantee data as seen by the function are of the specified types.
Whether an automatic conversion happens or not doesn’t have to be visible at the level of the called function. Only the callers can decide which behavior is appropriate: they are the ones who know whether their application works with a weak or strict typing logic!
In this matter, the declare() directive, which has to be set at the top of each file (or around each block) where calls can/must be made in strict mode, really answers our needs well: it offers possibilities without imposing more constraints than we can accept.
And for return types?
As RFC Return Type Declarations has been accepted while this one was still under discussion, a v0.3 has been published, adding scalar type-hints to return types.
The general idea is the same as for parameter types: in weak mode, conversions will automatically be applied if they are needed and possible, while in strict mode, an error will be raised if a function tries to return a value that’s not of the right type.
The major difference resides in the fact that the weak or strict mode is not decided when the function is called, but when it’s declared. This is because the people who are best able to determine whether their function or library returns the right types or needs conversions really are its authors!
For example, let’s consider the following two function declarations, both written while we are in weak mode:
// No specific declare() => weak typing mode
// Function that returns an integer, like planned
// => Everything is OK!
function my_function_01(int $a, int $b) : int
{
return $a + $b;
}
// Function that tries to return a string instead of an integer
// It's declared while we are in weak mode => conversion
function my_function_02(int $a, int $b) : int
{
return strval($a + $b);
}
We can call these two functions, while we are either in weak or in strict mode:
var_dump( my_function_01(2, 5) );
var_dump( my_function_02(2, 5) );
And the output will be the same in both cases:
int(7)
int(7)
The second function tried to return a string instead of an integer, but as it’s been declared in weak mode and the string could be converted to an integer, this conversion has taken place.
On the other hand, if we had defined those two same functions while being in strict mode, our two calls would have produced the following output (no matter if they had been made from a weak or a strict portion of code):
int(7)
Catchable fatal error: Return value of my_function_02() must be of the type integer, string returned
in .../return-02.php on line 15
As you’ll have understood, the second call failed, as the function was declared to return an integer, in strict mode, and tried to return a string.
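To complete the picture, here is a small sketch of the "needed and possible" condition for return values in weak mode: a numeric string is converted, while a non-numeric one is rejected. The names to_int_return() and try_return() are hypothetical, and the sketch assumes a released PHP 7 or later, where the failure is a TypeError exception rather than the catchable fatal error shown above.

```php
<?php
// No declare(strict_types=1) => this function's return value is checked in weak mode.

// Hypothetical function: returns whatever it was given, declared as int.
function to_int_return($value): int
{
    return $value;
}

// Hypothetical wrapper: exports the returned value, or reports the failure.
function try_return($value): string
{
    try {
        return var_export(to_int_return($value), true);
    } catch (TypeError $e) {
        // Assumption: released PHP 7+ throws a TypeError here.
        return 'TypeError';
    }
}

echo try_return('7'), "\n";     // 7: the string was converted to an integer
echo try_return('seven'), "\n"; // TypeError: no conversion possible
```

Either way, callers of to_int_return() never receive anything other than an integer.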
declare(), use strict, … directives
During discussions I’ve read and/or participated in, I’ve seen it said several times that declare(strict_types=1) would be ugly, that it would not be easy to memorize or type, or that a use strict might be better.
With a tiny bit of bad faith, if I could answer with only one thing, it would be:
When the \ namespace separator was introduced in PHP 5.3, while PHP 5.3 was already in its alpha phase and :: had been discussed for a very long time, everyone said it was ugly and hard to type… Still, a few years later, we all got used to it and we can’t say namespaces have not been a real success ;-)
With some additional arguments:
- use strict might cause some confusion, as use is already used for namespaces today – it’s even possible to declare a namespace called strict, even if trying to import it without aliasing it causes a Fatal error: You seem to be trying to use a different language... ;-)
- Using an option called strict and not strict_types would be lying: there are many things we could want to make more strict in PHP, and here, the proposition is about activating a strict mode for only one of them.
- After typing this directive a few times, it will naturally fall under our fingers[4] – and once PHP 7 is released, I’m sure our IDEs will know how to auto-complete it ;-)
In other words, even if this directive feels a bit strange at first sight, I have no doubt we’ll easily get used to it, especially considering the advantages it brings!
What now?
First of all, this RFC entered its voting phase a few days ago. Votes will end on February 19.
If this RFC is accepted, it will be possible to think about going a bit further, especially revisiting the idea of nullable types – an idea that might arrive with a later minor version, such as PHP 7.1 or 7.2.
For now, votes are at 41 yes and 27 no. Considering that a 2/3 majority of positive votes is required for the RFC to pass, things are not yet decided and could go either way!
At the same time, even though the RFC is in its voting phase, additional ideas are still being discussed, as recently as yesterday evening:
- Possibly adding a numeric type-hint for numbers, accepting both integers and floats. I don’t really have any opinion on this (for now).
- A possible switch to a marker on PHP’s opening tag, like <?php strict. Why not, but maybe with something more like <?php strict_types, to avoid closing the door on other ideas that might want to make PHP more strict?
My opinion, in a few words
I have followed discussions about the idea of type-hinting for scalars for a while and I have to admit that, this time, I think this proposition is pretty good, mostly thanks to the following points:
- This feature is not enabled by default; so, it doesn’t break existing code.
- Type-hints ensure that data received by functions that specify them in their declarations will be of the type they expect, no matter if they are called in weak or strict mode.
- Using a flexible typing mechanism by default fits well with PHP’s spirit, won’t be destabilizing for beginners, and will allow us to gradually adapt libraries and/then applications.
- It’s the callers who indicate when they’re ready to switch to strict typing for calls they do, once their code works with strict types.
- And it’s the authors of each library who determine – and they are the best placed to take this decision – whether they think their code will or will not work with strict return types.
I would add that Andrea, author of this RFC, did a really great job, be it on the RFC itself, which includes a lot of detail, in the discussions about her proposition, or in dealing with the large amount of feedback it has generated!
Well, in conclusion, I really hope this RFC will pass!
[1] As a matter of fact, PHP’s existing type-hints are not really hints, but more like strong checks, as errors are raised if they are not respected.
[2] In this post, I’ll talk about functions or methods, depending on the examples I’ll present. The principle is exactly the same in both cases, as a method is pretty much nothing more than a function placed inside an object.
[3] The declare() instruction itself is not new: it has existed for quite some time and aims to alter the behavior of PHP’s engine. It is not well-known though, as the currently existing directives are neither really useful nor widely used.
[4] I can already type declare(strict_types=1) without having to think much… And I’ve only spent a few half-hours playing with it!