Jsctypes/api
js-ctypes is a library for calling C/C++ functions from JavaScript without having to write or generate any C/C++ "glue code".
js-ctypes is already in mozilla-central, but the API is subject to change. This page contains design proposals for the eventual js-ctypes API.
Libraries
ctypes.open(name)
- Open a library. (TODO: all the details) This always returns a Library object or throws an exception.
Library objects have the following methods:
lib.declare(name, abi, rtype, [argtype1, ...])
- Declare a function. (TODO: all the details) This always returns a new callable CData object representing a function pointer to name, or throws an exception.
- If rtype is an array type, this throws a TypeError.
- If any argtypeN is an array type, the result is the same as if it had been the corresponding pointer type, argtypeN.elementType.ptr. (Rationale: This is how C and C++ treat array types in function declarations.)
(TODO: Explain what happens when you call a declared function. In brief: It uses ImplicitConvert to convert the JavaScript arguments to C and ConvertToJS to convert the return value to JS.)
Types
A type maps JS values to C/C++ values and vice versa. Types are used when declaring functions, and they can also be used to create and populate C/C++ data structures entirely from JS.
(Types and their prototypes are extensible: scripts can add new properties to them. Rationale: This is how most JavaScript constructors behave.)
Built-in types
ctypes provides the following types:
ctypes.int8_t, uint8_t, int16_t, uint16_t, int32_t, uint32_t, int64_t, uint64_t, float32_t, float64_t
- Primitive numeric types that behave the same way on all platforms (with the usual caveat that every platform has slightly different floating-point behavior in corner cases, and there's a limit to what we can realistically do about it).
- Since some 64-bit values are outside the range of the JavaScript number type, ctypes.int64_t and ctypes.uint64_t do not autoconvert to JavaScript numbers. Instead, they convert to objects of the wrapper types ctypes.Int64 and ctypes.UInt64 (which are JavaScript object types, not CTypes). See "64-bit integer objects" below.
ctypes.size_t, ssize_t, intptr_t, uintptr_t
- Primitive types whose size depends on the platform. (These types do not autoconvert to JavaScript numbers. Instead they convert to wrapper objects, even on 32-bit platforms. See "64-bit integer objects" below. Rationale: On 64-bit platforms, there are values of these types that cannot be precisely represented as JS numbers. It will be easier to write code that works on multiple platforms if the builtin types autoconvert in the same way on all platforms.)
ctypes.bool, short, unsigned_short, int, unsigned, unsigned_int, long, unsigned_long, float, double
- Types that behave like the corresponding C types. As in C, unsigned is always an alias for unsigned_int.
- (ctypes.long and ctypes.unsigned_long autoconvert to 64-bit integer objects on all platforms. The rest autoconvert to JavaScript numbers. Rationale: Some platforms have 64-bit long and some do not.)
ctypes.char, ctypes.signed_char, ctypes.unsigned_char
- Character types that behave like the corresponding C types. (These are very much like int8_t and uint8_t, but they differ in some details of conversion. For example, ctypes.char.array(30)(str) converts the string str to UTF-8 and returns a new CData object of array type.)
ctypes.char16_t
- A 16-bit unsigned character type representing a UTF-16 code unit. (This is distinct from uint16_t in details of conversion behavior. js-ctypes autoconverts C char16_t values to JavaScript strings of length 1.) For backwards compatibility, ctypes.jschar is an alias for char16_t.
ctypes.void_t
- The special C type void. This can be used as a return value type. (void is a keyword in JavaScript.)
ctypes.voidptr_t
- The C type void *.
The wrapped integer types are the types int64_t, uint64_t, size_t, ssize_t, intptr_t, uintptr_t, long, and unsigned_long. These are the types that autoconvert to 64-bit integer objects rather than to primitive JavaScript numbers.
User-defined types
Starting from the builtin types above, these functions can be used to create additional types:
new ctypes.PointerType(t)
- If t is a CType, return the type "pointer to t". The result is cached so that future requests for this pointer type produce the same CType object. If t is a string, instead return a new opaque pointer type named t. Otherwise throw a TypeError.
new ctypes.FunctionType(abi, rt, [ at1, ... ])
- Return a function pointer CType corresponding to the C type rt (*) (at1, ...), where abi is a ctypes ABI type and rt and at1, ... are CTypes. If any argument is not of the required kind, throw a TypeError.
new ctypes.ArrayType(t)
- Return an array type with unspecified length and element type t. If t is not a type or t.size is undefined, throw a TypeError.
new ctypes.ArrayType(t, n)
- Return the array type t[n]. If t is not a type or t.size is undefined or n is not a size value (defined below), throw a TypeError. If the size of the resulting array type, in bytes, would not be exactly representable both as a size_t and as a JavaScript number, throw a RangeError.
- A size value is either a non-negative, integer-valued primitive number, an Int64 object with a non-negative value, or a UInt64 object.
- (Array types with 0 elements are allowed. Rationale: C/C++ allow them, and it is convenient to be able to pass an array to a foreign function, and have it autoconverted to a C array, without worrying about the special case where the array is empty.)
new ctypes.StructType(name, fields)
- Create a new struct type with the given name and fields. fields is an array of field descriptors, of the format
    [ { field1: type1 }, { field2: type2 }, ... ]
- where fieldn is a string denoting the name of the field, and typen is a ctypes type. js-ctypes calculates the offsets of the fields from its encyclopedic knowledge of the architecture's struct layout rules. If name is not a string, or any typen is such that typen.size is undefined, throw a TypeError. If the size of the struct, in bytes, would not be exactly representable both as a size_t and as a JavaScript number, throw a RangeError.
(Open issue: Specify a way to tell ctypes.StructType to use #pragma pack(n).)
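The field-offset calculation that StructType performs can be pictured with a small model. The sketch below is plain JavaScript and does not use js-ctypes. It assumes the common rule that each field is aligned to its own size and that the struct is padded to its largest alignment, which holds for many ABIs but not all. The name layoutStruct and the {name, size} descriptor shape are hypothetical.

```javascript
// Hypothetical helper: compute field offsets under a "natural alignment"
// assumption (the alignment of a field equals its size). Real ABIs may differ.
function layoutStruct(fields) {
  let offset = 0;
  let maxAlign = 1;
  const offsets = {};
  for (const { name, size } of fields) {
    const align = size;
    // Round the running offset up to the next multiple of the alignment.
    offset = Math.ceil(offset / align) * align;
    offsets[name] = offset;
    offset += size;
    maxAlign = Math.max(maxAlign, align);
  }
  // Pad the total size so that arrays of the struct keep elements aligned.
  const size = Math.ceil(offset / maxAlign) * maxAlign;
  return { offsets, size };
}

const layout = layoutStruct([
  { name: "flag", size: 1 },   // sized like ctypes.uint8_t
  { name: "count", size: 4 },  // sized like ctypes.int32_t
  { name: "tag", size: 1 },    // sized like ctypes.uint8_t
  { name: "value", size: 8 },  // sized like ctypes.float64_t
]);
console.log(layout.offsets, layout.size);
```

Under these assumptions the padding after flag and tag is what pushes count to offset 4 and value to offset 16. A packing option such as the one raised in the open issue would change exactly this rounding step.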
These constructors behave exactly the same way when called without the new keyword.
Examples:
    const DWORD = ctypes.uint32_t;
    const HANDLE = new ctypes.PointerType("HANDLE");
    const HANDLES = new ctypes.ArrayType(HANDLE);
    const FILE = new ctypes.StructType("FILE").ptr;
    const IOBuf = new ctypes.ArrayType(ctypes.uint8_t, 4096);
    const struct_tm = new ctypes.StructType('tm', [{'tm_sec': ctypes.int}, ...]);
    const comparator_t = new ctypes.FunctionType(ctypes.default_abi, ctypes.int, [ ctypes.voidptr_t, ctypes.voidptr_t ]);
Properties of types
All the fields described here are read-only.
All types have these properties and methods:
t.size
- The C/C++ sizeof the type, in bytes. The result is a primitive number, not a UInt64 object.
- If t is an array type with unspecified length, t.size is undefined.
- ctypes.void_t.size is undefined.
t.name
- A string, the type's name. It's intended that in ordinary use, this will be a C/C++ type expression, but it's not really meant to be machine-readable in all cases.
- For primitive types this is just the name of the corresponding C/C++ type.
- For struct types and opaque pointer types, this is simply the string that was passed to the constructor. For other function, pointer, and array types this should try to generate valid C/C++ type expressions, which isn't exactly trivial.
- (Open issue: This conflicts with the usual meaning of .name for functions, and types are callable like functions.)
    ctypes.int32_t.name
      ===> "int32_t"
    ctypes.void_t.name
      ===> "void"
    ctypes.char16_t.ptr.name
      ===> "char16_t *"

    const FILE = new ctypes.StructType("FILE").ptr;
    FILE.name
      ===> "FILE*"

    const fn_t = new ctypes.FunctionType(ctypes.stdcall, ctypes.int, [ ctypes.voidptr_t, ctypes.voidptr_t ]);
    fn_t.name
      ===> "int (__stdcall *)(void*, void*)"

    const struct_tm = new ctypes.StructType("tm", [{tm_sec: ctypes.int}, ...]);
    struct_tm.name
      ===> "tm"

    // Pointer-to-array types are not often used in C/C++.
    // Such types have funny-looking names.
    const ptrTo_ptrTo_arrayOf4_strings =
      new ctypes.PointerType(
        new ctypes.PointerType(
          new ctypes.ArrayType(new ctypes.PointerType(ctypes.char), 4)));
    ptrTo_ptrTo_arrayOf4_strings.name
      ===> "char *(**)[4]"
t.ptr
- Return ctypes.PointerType(t).
t.array()
- Return ctypes.ArrayType(t).
t.array(n)
- Return ctypes.ArrayType(t, n).
- Thus a quicker (but still almost as confusing) way to write the type in the previous example would be:
    const ptrTo_ptrTo_arrayOf4_strings = ctypes.char.ptr.array(4).ptr.ptr;
- (.array() requires parentheses but .ptr doesn't. Rationale: .array() has to be able to handle an optional parameter. Note that in C/C++, to write an array type requires brackets, optionally with a number in between: int [10] --> ctypes.int.array(10). Writing a pointer type does not require the brackets.)
t.toString()
- Return "type " + t.name.
t.toSource()
- Return a JavaScript expression that evaluates to a CType describing the same C/C++ type as t.
    ctypes.uint32_t.toSource()
      ===> "ctypes.uint32_t"
    ctypes.string.toSource()
      ===> "ctypes.string"

    const charPtr = new ctypes.PointerType(ctypes.char);
    charPtr.toSource()
      ===> "ctypes.char.ptr"

    const Point = new ctypes.StructType(
      "Point", [{x: ctypes.int32_t}, {y: ctypes.int32_t}]);
    Point.toSource()
      ===> "ctypes.StructType("Point", [{x: ctypes.int32_t}, {y: ctypes.int32_t}])"
Pointer types also have:
t.targetType
- Read-only. The pointed-to type, or null if t is an opaque pointer type.
Function types also have:
t.abi
- Read-only. The ABI of the function; one of the ctypes ABI objects.
t.returnType
- Read-only. The return type.
t.argTypes
- Read-only. A sealed array of argument types.
Struct types also have:
t.fields
- Read-only. A sealed array of field descriptors. (TODO: Details.)
Array types also have:
t.elementType
- The type of the elements of an array of this type. E.g. IOBuf.elementType === ctypes.uint8_t.
t.length
- The number of elements, a non-negative integer; or undefined if this is an array type with unspecified length. (The result, if not undefined, is a primitive number, not a UInt64 object. Rationale: Having .length produce anything other than a number is foreign to JS, and arrays of more than 2^53 elements are currently unheard-of.)
Minutiae:
- ctypes.CType is the abstract-base-class constructor of all js-ctypes types. If called, it throws a TypeError. (This is exposed in order to expose ctypes.CType.prototype.)
- The [[Class]] of a ctypes type is "CType".
- The [[Class]] of the type constructors ctypes.{C,Array,Struct,Pointer}Type is "Function".
- Every CType has a read-only, permanent .prototype property. The type-constructors ctypes.{C,Pointer,Struct,Array}Type each have a read-only, permanent .prototype property as well.
- Types have a hierarchy of prototype objects. The prototype of ctypes.CType.prototype is Function.prototype. The prototype of ctypes.{Array,Struct,Pointer,Function}Type.prototype and of all the builtin types except ctypes.voidptr_t is ctypes.CType.prototype. The prototype of an array type is ctypes.ArrayType.prototype. The prototype of a struct type is ctypes.StructType.prototype. The prototype of a pointer type is ctypes.PointerType.prototype. The prototype of a function type is ctypes.FunctionType.prototype.
- Every CType t has t.prototype.constructor === t; that is, its .prototype has a read-only, permanent, own .constructor property that refers to the type. The same is true of the five type constructors ctypes.{C,Array,Struct,Pointer,Function}Type.
Calling types
CTypes are JavaScript constructors. That is, they are functions, and they can be called to create new objects. (The objects they create are called CData objects, and they are described in the next section.)
new t or new t() or t()
- Create a new CData object of type t.
- Without arguments, these allocate a new buffer of t.size bytes, populate it with zeroes, and return a new CData object referring to the complete object in that buffer.
- If t.size is undefined, this throws a TypeError.
new t(val) or t(val)
- Create a new CData object as follows:
  - If t.size is not undefined: Convert val to type t by calling ExplicitConvert(val, t), throwing a TypeError if the conversion is impossible. Allocate a new buffer of t.size bytes, populated with the converted value. Return a new CData object of type t referring to the complete object in that buffer. (When val is a CData object of type t, the behavior is like malloc followed by memcpy.)
  - If t is an array type of unspecified length:
    - If val is a size value (defined above): Let u = ArrayType(t.elementType, val) and return new u.
    - If t.elementType is char16_t and val is a string: Return a new CData object of type ArrayType(ctypes.char16_t, val.length + 1) containing the contents of val followed by a null character.
    - If t.elementType is an 8-bit character type and val is a string: If val is not a well-formed UTF-16 string, throw a TypeError. Otherwise, let s = a sequence of bytes, the result of converting val from UTF-16 to UTF-8, and let n = the number of bytes in s. Return a new CData object of type ArrayType(t.elementType, n + 1) containing the bytes in s followed by a null character.
    - If val is a JavaScript array object and val.length is a nonnegative integer, let u = ArrayType(t.elementType, val.length) and return new u(val). (Array CData objects created in this way have cdata.constructor === u, not t. Rationale: For all CData objects, cdata.constructor.size gives the size in bytes, unless a struct field shadows cdata.constructor.)
    - Otherwise, throw a TypeError.
  - Otherwise, t is void_t. Throw a TypeError.
    let a_t = ctypes.ArrayType(ctypes.int32_t);
    let a = new a_t(5);
    a.length
      ===> 5
    a.constructor.size
      ===> 20
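The length rule for 8-bit character arrays built from a string can be checked without js-ctypes. The following is a minimal sketch in plain JavaScript: isWellFormedUTF16 and utf8ArrayLength are hypothetical helpers, and the use of the standard TextEncoder API is an assumption about the host environment. It returns the element count that the rule above gives for an array created from a string, namely the UTF-8 byte count plus one for the null character.

```javascript
// Hypothetical helpers; not part of js-ctypes.

// Returns true if str contains no unpaired surrogate code units.
function isWellFormedUTF16(str) {
  for (let i = 0; i < str.length; i++) {
    const c = str.charCodeAt(i);
    if (c >= 0xd800 && c <= 0xdbff) {
      // High surrogate: must be followed by a low surrogate.
      const next = str.charCodeAt(i + 1);
      if (!(next >= 0xdc00 && next <= 0xdfff)) return false;
      i++; // skip the low surrogate that was just validated
    } else if (c >= 0xdc00 && c <= 0xdfff) {
      return false; // low surrogate with no preceding high surrogate
    }
  }
  return true;
}

// Number of array elements needed for str as null-terminated UTF-8.
function utf8ArrayLength(str) {
  if (!isWellFormedUTF16(str)) {
    throw new TypeError("string is not well-formed UTF-16");
  }
  const n = new TextEncoder().encode(str).length; // UTF-8 byte count
  return n + 1; // room for the trailing null character
}

console.log(utf8ArrayLength("abc"));        // 4
console.log(utf8ArrayLength("h\u00e9llo")); // 7
```

Note that the array length is counted in bytes, so it can exceed val.length + 1 whenever the string contains non-ASCII characters.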
CData objects
A CData object represents a C/C++ value located in memory. The address of the C/C++ value can be taken (using the .address() method), and it can be assigned to (using the .value property).
Every CData object has a type, the CType object that describes the type of the C/C++ value.
Minutiae:
- The [[Class]] of a CData object is "CData".
- The prototype of a CData object is the same as its type's .prototype property.
(Implementation notes: A CData object has a reserved slot that points to its type; a reserved slot that contains null if the object owns its own buffer, and otherwise points to the base CData object that owns the backing buffer where the data is stored; and a data pointer. The data pointer points to the actual location within the buffer of the C/C++ object to which the CData object refers. Since the data pointer might not be aligned to 2 bytes, PRIVATE_TO_JSVAL is insufficient; a custom JSClass.trace hook will be needed. If the object owns its own buffer, its finalizer frees it. Other CData objects that point into the buffer keep the base CData, and therefore the underlying buffer, alive.)
Properties and methods of CData objects
All CData objects have these methods and properties:
cdata.address()
- Return a new CData object of the pointer type ctypes.PointerType(cdata.constructor) whose value points to the C/C++ object referred to by cdata.
- (Open issue: Does this pointer keep cdata alive? Currently not, but we could easily change it. It is impossible to have all pointers keep their referents alive in a totally general way--consider pointers embedded in structs and arrays. But this special case would be pretty easy to hack: put a .contents property on the resulting pointer, referring back to cdata.)
cdata.constructor
- Read-only. The type of cdata. (This is never void_t or an array type with unspecified length. Implementation note: The prototype of cdata is an object that has a read-only constructor property, as detailed under "minutiae".)
cdata.toSource()
- Return the string "t(arg)" where t and arg are implementation-defined JavaScript expressions (intended to represent the type of cdata and its value, respectively). The intent is that eval(cdata.toSource()) should ideally produce a new CData object containing a copy of cdata, but this can only work if the type of cdata happens to be bound to an appropriate name in scope.
cdata.toString()
- Return the same string as cdata.toSource().
The .value property has a getter and a setter:
cdata.value
- Let x = ConvertToJS(cdata). If x === cdata, throw a TypeError. Otherwise return x.
cdata.value = val
- Let cval = ImplicitConvert(val, cdata.constructor). If conversion fails, throw a TypeError. Otherwise assign the value cval to the C/C++ object referred to by cdata.
Structs
CData objects of struct types also have this method:
cstruct.addressOfField(name)
- Return a new CData object of the appropriate pointer type, whose value points to the field of cstruct with the name name. If name is not a JavaScript string or does not name a member of cstruct, throw a TypeError.
They also have getters and setters for each struct member:
cstruct.member
- Let F be a CData object referring to the struct member. Return ConvertToJS(F).
cstruct.member = val
- Let cval = ImplicitConvert(val, the type of the member). If conversion fails, throw a TypeError. Otherwise store cval in the appropriate member of the struct.
These getters and setters can shadow the properties and methods described above.
Pointers
CData objects of pointer types also have this property:
cptr.contents
- Let C be a CData object referring to the pointed-to contents of cptr. Return ConvertToJS(C).
cptr.contents = val
- Let cval = ImplicitConvert(val, the base type of the pointer). If conversion fails, throw a TypeError. Otherwise store cval in the pointed-to contents of cptr.
Functions
CData objects of function types are callable:
let result = cfn(arg1, ...)
- Let (carg1, ...) be CData objects representing the arguments to the C function cfn, and cresult be a CData object representing its return value. Let cargn = ImplicitConvert(argn, the type of the argument); if any conversion fails, throw a TypeError. Call the C function with the arguments represented by (carg1, ...), and store the result in cresult. Finally, let result = ConvertToJS(cresult).
Arrays
Likewise, CData objects of array types have getters and setters for each element. Arrays additionally have a length property.
Note that these getters and setters are only present for integers i in the range 0 ≤ i < carray.length. (Open issue: can we arrange to throw an exception if i is out of range?)
carray[i]
- Let E be a CData object referring to the element at index i. Return ConvertToJS(E).
carray[i] = val
- Let cval = ImplicitConvert(val, carray.elementType). If conversion fails, throw a TypeError. Otherwise store cval in element i of the array.
carray.length
- Read-only. The length of the array as a JavaScript number. (The same as carray.constructor.length. This is not a UInt64 object. Rationale: Array CData objects should behave like other array-like objects for easy duck typing.)
carray.addressOfElement(i)
- Return a new CData object of the appropriate pointer type (ctypes.PointerType(carray.constructor.elementType)) whose value points to element i of carray. If i is not a JavaScript number that is a valid index of carray, throw a TypeError.
(TODO: specify a way to read a C/C++ string and transcode it into a JS string.)
Aliasing
Note that it is possible for several CData objects to refer to the same or overlapping memory. (In this way CData objects are like C++ references.) For example:
    const Point = new ctypes.StructType(
      "Point", [{x: ctypes.int32_t}, {y: ctypes.int32_t}]);
    const Rect = new ctypes.StructType(
      "Rect", [{topLeft: Point}, {bottomRight: Point}]);

    var r = Rect();     // a new CData object of type Rect
    var p = r.topLeft;  // refers to the topLeft member of r, not a copy
    r.topLeft.x = 100;  // This would not work if `r.topLeft` was a copy!
    r.topLeft.x
      ===> 100          // It works...
    p.x                 // and p refers to the same C/C++ object...
      ===> 100          // so it sees the change as well.

    r.toSource()
      ===> "Rect({topLeft: {x: 100, y: 0}, bottomRight: {x: 0, y: 0}})"

    p.x = 1.0e90;       // Assigning a value out of range is an error.
      **** TypeError

    // The range checking is great, but it can have surprising
    // consequences sometimes:
    p.x = 0x7fffffff;   // (the maximum int32_t value)
    p.x++;              // p.x = 0x7fffffff + 1, which is out of range...
      **** TypeError    // ...so this fails, leaving p.x unchanged.

    // But JS code doesn't need to do that very often.
    // To make this roll around to -0x80000000, you could write:
    p.x = (p.x + 1) | 0;  // In JS, `x|0` truncates a number to int32.
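The wrap-around idiom at the end of the example can be tried in plain JavaScript, without js-ctypes. This is only an illustration of the | 0 operator; the variable names are made up for the sketch.

```javascript
const INT32_MAX = 0x7fffffff;

// Ordinary JS arithmetic leaves the int32 range...
const overflowed = INT32_MAX + 1;

// ...while `| 0` applies the ToInt32 conversion, wrapping to the
// minimum int32 value.
const wrapped = (INT32_MAX + 1) | 0;

console.log(overflowed); // 2147483648
console.log(wrapped);    // -2147483648
```

The first value is what a range-checked int32_t field would reject; the second is always a valid int32_t value.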
Casting
ctypes.cast(cdata, t)
- Return a new CData object which points to the same memory block as cdata, but with type t. If t.size is undefined or larger than cdata.constructor.size, throw a TypeError. This is like a C cast or a C++ reinterpret_cast.
Equality
According to the ECMAScript standard, if x and y are two different objects, then x === y and x == y are both false. This has consequences for code that uses js-ctypes pointers, pointer-sized integers, or 64-bit integers, because all these values are represented as JavaScript objects. In C/C++, the == operator would compare values of these types for equality. Not so in js-ctypes:
    const HANDLE = new ctypes.PointerType("HANDLE");
    const INVALID_HANDLE_VALUE = HANDLE(-1);
    const kernel32 = ctypes.open("kernel32");
    const CreateMutex = kernel32.declare("CreateMutex", ...);

    var h = CreateMutex(null, false, null);
    if (h == INVALID_HANDLE_VALUE)   // BAD - always false
      ...
This comparison is always false because CreateMutex returns a new CData object, which of course will be a different object from the existing value of INVALID_HANDLE_VALUE.
(Python ctypes has the same issue. It isn't mentioned in the docs, but:
    >>> from ctypes import *
    >>> c_void_p(0) == c_void_p(0)
    False
    >>> c_int(33) == c_int(33)
    False
We could overload operator== using the nonstandard hook JSExtendedClass.equality but it might not be worth it.)
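The object-identity behavior described here is ordinary JavaScript and can be reproduced without js-ctypes. In the sketch below, WrappedValue is a made-up stand-in for any object-wrapped C value; it is not a js-ctypes type, and comparing string forms is shown only as one possible workaround, not as an API this page specifies for pointers. (For 64-bit integer objects, the compare functions described later on this page serve this purpose.)

```javascript
// Stand-in for an object-wrapped C value; NOT a js-ctypes type.
class WrappedValue {
  constructor(value) {
    this.value = BigInt(value);
    Object.freeze(this);
  }
  toString() {
    return this.value.toString();
  }
}

const a = new WrappedValue(-1);
const b = new WrappedValue(-1);

console.log(a == b);   // false: two distinct objects
console.log(a === b);  // false
// Comparing a primitive derived from each object compares the values.
console.log(a.toString() === b.toString()); // true
```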
64-bit integer objects
Since JavaScript numbers are floating-point values, they cannot precisely represent all 64-bit integer values. Therefore 64-bit and pointer-sized C/C++ values of numeric types do not autoconvert to JavaScript numbers. Instead they autoconvert to JavaScript objects of type ctypes.Int64 and ctypes.UInt64.
Int64 and UInt64 objects are immutable.
It's not possible to do arithmetic on Int64 objects using the standard arithmetic operators, because JavaScript does not have operator overloading (yet). A few convenience functions are provided. (These types are intentionally feature-sparse so that they can be drop-in-replaced with a full-featured bignum type when JavaScript gets one.)
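The precision limit that motivates these wrapper objects is easy to observe in plain JavaScript. The sketch below uses BigInt purely as a stand-in for an exact 64-bit value; BigInt was added to JavaScript after this proposal was written and is not part of the API described here.

```javascript
// 2^53 + 1 is not representable as a JavaScript number.
const big = 2 ** 53;
const lostIncrement = big + 1 === big;
console.log(lostIncrement);            // true: the +1 is rounded away

// Held as an arbitrary-precision integer, the same value keeps the +1.
const exact = 2n ** 53n + 1n;
console.log(exact.toString());         // "9007199254740993"

// The largest uint64_t value rounds up to 2^64 when forced into a number.
const maxU64AsNumber = Number(2n ** 64n - 1n);
console.log(maxU64AsNumber === 2 ** 64); // true
```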
Int64
ctypes.Int64(n) or new ctypes.Int64(n)
- If n is an integer-valued number such that -2^63 ≤ n < 2^63, return a sealed Int64 object with that value. Otherwise if n is a string consisting of an optional minus sign followed by either decimal digits or "0x" or "0X" and hexadecimal digits, and the string represents a number within range, convert the string to an integer and construct an Int64 object as above. Otherwise if n is an Int64 or UInt64 object, and represents a number within range, use the value to construct an Int64 object as above. Otherwise throw a TypeError.
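The number and string cases of the constructor rule above can be modelled in a few lines. This is a sketch under stated assumptions, not the js-ctypes implementation: parseInt64 is a hypothetical name, a BigInt stands in for the resulting Int64 object, and the case where n is already an Int64 or UInt64 object is left out.

```javascript
const INT64_MIN = -(2n ** 63n);
const INT64_MAX = 2n ** 63n - 1n;

// Hypothetical model of the Int64(n) argument rules for numbers and strings.
// Returns a BigInt standing in for the value of the Int64 object.
function parseInt64(n) {
  let value;
  if (typeof n === "number" && Number.isInteger(n)) {
    value = BigInt(n);
  } else if (typeof n === "string") {
    // Optional minus sign, then decimal digits or 0x/0X plus hex digits.
    const m = /^(-?)(0[xX][0-9a-fA-F]+|[0-9]+)$/.exec(n);
    if (!m) throw new TypeError("not an integer string: " + n);
    // BigInt() parses "0x..." but rejects a signed hex string, so the
    // sign is applied separately.
    const magnitude = BigInt(m[2]);
    value = m[1] ? -magnitude : magnitude;
  } else {
    throw new TypeError("unsupported argument");
  }
  if (value < INT64_MIN || value > INT64_MAX) {
    throw new TypeError("value out of int64 range");
  }
  return value;
}

console.log(parseInt64("-42").toString());
console.log(parseInt64("0xff").toString());
```

The range check comes last so that a string such as "-0x8000000000000000" is accepted while "9223372036854775808" is rejected, matching the asymmetric bounds -2^63 ≤ n < 2^63.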
Int64 objects have the following methods:
i64.toString([radix])
- If radix is omitted, assume 10. Return a string representation of i64 in base radix, consisting of a leading minus sign, if the value is negative, followed by one or more lowercase digits in base radix.
i64.toSource()
- Return a string. (This is provided for debugging purposes, and programs should not rely on details of the resulting string, which may change in the future.)
The following functions are also provided:
ctypes.Int64.compare(a, b)
- If a and b are both Int64 objects, return -1 if a < b, 0 if a = b, and 1 if a > b. Otherwise throw a TypeError.
ctypes.Int64.lo(a)
- If a is an Int64 object, return the low 32 bits of its value. (The result is an integer in the range 0 ≤ result < 2^32.) Otherwise throw a TypeError.
ctypes.Int64.hi(a)
- If a is an Int64 object, return the high 32 bits of its value (like a >> 32). Otherwise throw a TypeError.
ctypes.Int64.join(hi, lo)
- If hi is an integer-valued number in the range -2^31 ≤ hi < 2^31 and lo is an integer-valued number in the range 0 ≤ lo < 2^32, return a sealed Int64 object whose value is hi × 2^32 + lo. Otherwise throw a TypeError.
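The relationship between lo, hi, and join can be checked with a small model. The sketch below is not the js-ctypes implementation: it uses BigInt values in place of Int64 objects, the function names merely mirror the ones above, and the argument validation (throwing a TypeError) is omitted.

```javascript
// BigInt model of the Int64 helper functions; not the js-ctypes code.
const TWO_32 = 2n ** 32n;

function lo(a) {
  // Low 32 bits as an unsigned number in the range [0, 2^32).
  return Number(BigInt.asUintN(32, a));
}

function hi(a) {
  // High 32 bits as a signed number, like a >> 32.
  return Number(a >> 32n);
}

function join(hiPart, loPart) {
  // hi × 2^32 + lo
  return BigInt(hiPart) * TWO_32 + BigInt(loPart);
}

function compare(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

const v = -2n; // stands in for an Int64 object holding -2
console.log(lo(v), hi(v));               // 4294967294 -1
console.log(join(hi(v), lo(v)) === v);   // true
console.log(compare(v, 0n));             // -1
```

Because lo is unsigned and hi is signed, join(hi(a), lo(a)) reproduces a for every value in the int64 range, including negative ones.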
UInt64
UInt64 objects are the same except that the hi values are in the range 0 ≤ hi < 2^32 and the .toString() method never produces a minus sign.
Conversions
These functions are not exactly JS functions or C/C++ functions. They're algorithms used elsewhere in the spec.
ConvertToJS(x)
- This function is used to convert a CData object or a C/C++ return value to a JavaScript value. The intent is to return a simple JavaScript value whenever possible without loss of data or different behavior on different platforms, and a CData object otherwise. The precise rules are:
  - If the type of x is void, return undefined.
  - If the type of x is bool, return the corresponding JavaScript boolean.
  - If x is of a number type but not a wrapped integer type, return the corresponding JavaScript number.
  - If x is of a signed wrapped integer type (long, int64_t, ssize_t, or intptr_t), return a ctypes.Int64 object with value x.
  - If x is of an unsigned wrapped integer type (unsigned long, uint64_t, size_t, or uintptr_t), return a ctypes.UInt64 object with value x.
  - If x is of type char16_t, return a JavaScript string of length 1 containing the value of x (like String.fromCharCode(x)).
  - If x is of any other character type, return the JavaScript number equal to its integer value. (This is sensitive to the signedness of the character type. Also, we assume no character types are so wide that they don't fit into a JavaScript number.)
  - Otherwise x is of an array, struct, or pointer type. If the argument x is already a CData object, return it. Otherwise allocate a buffer containing a copy of the C/C++ value x, and return a CData object of the appropriate type referring to the object in the new buffer.
Note that null C/C++ pointers do not convert to the JavaScript null value. (Open issue: Should we? Is there any value in retaining the type of a particular null pointer?)
(Arrays of characters do not convert to JavaScript strings. Rationale: Suppose x is a CData object of a struct type with a member a of type char[10]. Then x.a[1] should return the character in element 1 of the array, even if x.a[0] is a null character. Likewise, x.a[0] = '\0'; should modify the contents of the array. Both are possible only if x.a is a CData object of array type, not a JavaScript string.)
ImplicitConvert(val, t)
- Convert the JavaScript value val to a C/C++ value of type t. This is called whenever a JavaScript value of any kind is passed to a parameter of a ctypes-declared function, passed to cdata.value = val, or assigned to an array element or struct member, as in carray[i] = val or cstruct.member = val.
This function is intended to lose precision only when there is no reasonable alternative. It generally does not coerce values of one type to another type.
C/C++ values of all supported types round trip through ConvertToJS and ImplicitConvert without any loss of data. That is, for any C/C++ value v of type t, ImplicitConvert(ConvertToJS(v), t) produces a copy of v. (Note that not all JavaScript values can round-trip to C/C++ and back in an analogous way. JavaScript primitive numbers can round-trip to double on all current platforms, Int64 objects to int64_t, JavaScript booleans to bool, and so on. But some JavaScript values, such as functions, cannot be ImplicitConverted to any C/C++ type without loss of data.)
t must not be void or an array type with unspecified length. (Rationale: C/C++ variables and parameters cannot have such types. The parameter of a function declared int f(int x[]) is int *, not int[].)
- First, if val is a CData object of type u and SameType(t, u), return the current value of the C/C++ object referred to by val. Otherwise the behavior depends on the target type t.
- If t is ctypes.bool:
  - If val is a boolean, return the corresponding C/C++ boolean value.
  - If val is the number +0 or -0, return false.
  - If val is the number 1, return true.
  - Otherwise fail.
- If t is a numeric type:
  - If val is a boolean, the result is a 0 or 1 of type t.
  - If val is a CData object of a numeric type, and every value of that type is precisely representable in type t, the result is a precise representation of the value of val in type t. (This is more conservative than the implicit integer conversions in C/C++ and more conservative than what we do if val is a JavaScript number. This is sensitive to the signedness of the two types.)
  - If val is a number that can be exactly represented as a value of type t, the result is that value.
  - If val is an Int64 or UInt64 object whose value can be exactly represented as a value of type t, the result is that value.
  - If val is a number and t is a floating-point type, the result is the jsdouble represented by val, cast to type t. (This can implicitly lose bits of precision. The rationale is to allow the user to pass values like 1/3 to float parameters.)
  - Otherwise fail.
- If t is ctypes.char16_t:
  - If val is a string of length 1, the result is the 16-bit unsigned value of the code unit in the string, val.charCodeAt(0).
  - If val is a number that can be exactly represented as a value of type char16_t (that is, an integer in the range 0 ≤ val < 2^16), the result is that value.
  - Otherwise fail.
- If t is any other character type:
  - If val is a string:
    - If the 16-bit elements of val are not the UTF-16 encoding of a single Unicode character, fail. (Open issue: If we support wchar_t we may want to allow unpaired surrogate code points to pass through without error.)
    - If that Unicode character can be represented by a single character of type t, the result is that character. (Open issue: Unicode conversions.)
    - Otherwise fail.
  - If val is a number that can be exactly represented as a value of type t, the result is that value. (This is sensitive to the signedness of t.)
  - Otherwise fail.
- If t is a pointer type:
  - If val is null, the result is a C/C++ NULL pointer of type t.
  - If val is a CData object of array type u and either t is ctypes.voidptr_t or SameType(t.targetType, u.elementType), return a pointer to the first element of the array.
  - If t is ctypes.voidptr_t and val is a CData object of pointer type, return the value of the C/C++ pointer in val, cast to void *.
  - Otherwise fail. (Rationale: We don't convert strings to pointers yet; see the "Auto-converting strings" section below. We don't convert JavaScript arrays to pointers because this would have to allocate a C array implicitly, raising issues about who should deallocate it, and when, and how they know it's their responsibility.)
- If t is an array type:
- If val is a JavaScript string:
- If t.elementType is char16_t and t.length >= val.length, the result is an array of type t whose first val.length elements are the 16-bit elements of val. If t.length > val.length, then element val.length of the result is a null character. The values of the rest of the array elements are unspecified.
- If t.elementType is an 8-bit character type:
- If val is not well-formed UTF-16, fail.
- Let s = a sequence of bytes, the result of converting val from UTF-16 to UTF-8.
- Let n = the number of bytes in s.
- If t.length < n, fail.
- The result is an array of type t whose first n elements are the 8-bit values in s. If t.length > n, then element n of the result is 0. The values of the rest of the array elements are unspecified.
- Otherwise fail.
- If val is a JavaScript array object:
- If val.length is not a nonnegative integer, fail.
- If val.length !== t.length, fail.
- Otherwise, the result is a C/C++ array of val.length elements of type t.elementType. Element i of the result is ImplicitConvert(val[i], t.elementType).
- Otherwise fail. (Rationale: The clause "If val is a JavaScript array object" requires some justification. If we allowed arbitrary JavaScript objects that resemble arrays, that would include CData objects of array type. Consequently, arr1.value = arr2 where arr1 is of type ctypes.uint8_t.array(30) and arr2 is of type ctypes.int.array(30) would work as long as the values in arr2 are small enough. We considered this conversion too astonishing and too error-prone.)
- Otherwise t is a struct type.
- If val is a JavaScript object that is not a CData object:
- If the enumerable own properties of val are exactly the names of the members of the struct t, the result is a C/C++ struct of type t, each of whose members is ImplicitConvert(val[the member name], the type of the member).
- Otherwise fail.
- Otherwise fail.
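The array-type rules of ImplicitConvert can be modelled in plain JavaScript, without ctypes. The following is only a sketch, not the js-ctypes implementation: the function names are hypothetical, an 8-bit character array is modelled as a Uint8Array, and the "unspecified" trailing elements are simply left as zero.

```javascript
// Hypothetical model of two ImplicitConvert rules for array types.

// True if s contains no unpaired surrogate code units.
function isWellFormedUTF16(s) {
  for (let i = 0; i < s.length; i++) {
    const c = s.charCodeAt(i);
    if (c >= 0xd800 && c <= 0xdbff) {
      // High surrogate: must be followed by a low surrogate.
      const next = s.charCodeAt(i + 1);
      if (!(next >= 0xdc00 && next <= 0xdfff)) return false;
      i++; // skip the low surrogate just validated
    } else if (c >= 0xdc00 && c <= 0xdfff) {
      return false; // low surrogate without a preceding high surrogate
    }
  }
  return true;
}

// JS string -> array of `length` 8-bit characters (UTF-8 encoded).
function implicitConvertToCharArray(val, length) {
  if (!isWellFormedUTF16(val)) {
    throw new TypeError("string is not well-formed UTF-16");
  }
  const s = new TextEncoder().encode(val); // UTF-16 -> UTF-8 bytes
  const n = s.length;
  if (length < n) throw new TypeError("array too short for converted string");
  const result = new Uint8Array(length); // zero-filled, so element n is 0
  result.set(s);
  return result;
}

// JS array -> C array: lengths must match exactly, and every element
// is converted with the element type's own conversion function.
function implicitConvertArray(val, length, convertElement) {
  if (!Array.isArray(val)) throw new TypeError("not a JavaScript array object");
  if (val.length !== length) throw new TypeError("array length mismatch");
  return Array.from(val, (v) => convertElement(v));
}

// Element conversion for uint8_t: only exactly representable numbers pass.
function toUint8(v) {
  if (typeof v !== "number" || !Number.isInteger(v) || v < 0 || v > 255) {
    throw new TypeError("not exactly representable as uint8_t");
  }
  return v;
}
```

Note that the string rule accepts an array longer than the converted string, while the JavaScript-array rule requires the lengths to match exactly.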
ExplicitConvert(val, t)
- Convert the JavaScript value val to a C/C++ value of type t, a little more forcefully than ImplicitConvert.
This is called when a JavaScript value is passed as a parameter when calling a type, as in t(val) or new t(val).
- If ImplicitConvert(val, t) succeeds, use that result. Otherwise:
- If t is ctypes.bool, the result is the C/C++ boolean value corresponding to ToBoolean(val), where the operator ToBoolean is as defined in the ECMAScript standard. (This is a bit less strict than the conversion behavior specified for numeric types below. This is just for convenience: the operators && and ||, which produce a boolean value in C/C++, do not always do so in JavaScript.)
- If t is an integer or character type and val is an infinity or NaN, the result is a 0 of type t.
- If t is an integer or character type and val is a finite number, the result is the same as casting the jsdouble value of val to type t with a C-style cast. (I think this basically means, start with val, discard the fractional part, convert the integer part to a bit-pattern, and mask off whatever doesn't fit in type t. But whatever C does is good enough for me. --jorendorff)
- If t is an integer or character type and val is an Int64 or UInt64 object, the result is the same as casting the int64_t or uint64_t value of val to type t with a C-style cast.
- If t is a pointer type and val is a number, Int64 object, or UInt64 object that can be exactly represented as an intptr_t or uintptr_t, the result is the same as casting that intptr_t or uintptr_t value to type t with a C-style cast.
- If t is an integer type (not a character type) and val is a string consisting entirely of an optional minus sign, followed by either one or more decimal digits or the characters "0x" or "0X" and one or more hexadecimal digits, then the result is the same as casting the integer named by val to type t with a C-style cast.
- Otherwise fail.
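The integer rules of ExplicitConvert can be sketched in plain JavaScript. This is a model under stated assumptions, not the js-ctypes implementation: the names explicitConvertToInt and cStyleCast are hypothetical, a type is described only by its bit width and signedness, and results are returned as BigInt so that 64-bit values stay exact. The sketch follows the "truncate, then mask" reading given in the parenthetical above; C itself leaves out-of-range floating-point-to-integer casts undefined.

```javascript
// Hypothetical model of ExplicitConvert for integer types up to 64 bits.

// Keep only the low `bits` bits of the BigInt n, read as signed or unsigned.
function cStyleCast(n, bits, signed) {
  return signed ? BigInt.asIntN(bits, n) : BigInt.asUintN(bits, n);
}

// Optional minus sign, then decimal digits or 0x/0X plus hex digits.
const INTEGER_STRING = /^(-?)(\d+|0[xX][0-9a-fA-F]+)$/;

function explicitConvertToInt(val, bits, signed) {
  if (typeof val === "number") {
    if (!Number.isFinite(val)) return 0n; // infinity or NaN becomes 0
    // Discard the fractional part, then mask to the target width.
    return cStyleCast(BigInt(Math.trunc(val)), bits, signed);
  }
  if (typeof val === "string") {
    const m = INTEGER_STRING.exec(val);
    if (m === null) throw new TypeError("not an integer string");
    const magnitude = BigInt(m[2]); // BigInt() accepts "123" and "0x7b"
    return cStyleCast(m[1] === "-" ? -magnitude : magnitude, bits, signed);
  }
  throw new TypeError("cannot convert value");
}
```

For example, explicitConvertToInt(300, 8, false) wraps around to 44n, whereas ImplicitConvert would reject 300 for an 8-bit type because it is not exactly representable.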
SameType(t, u)
- True if t and u represent the same C/C++ type.
- If t and u represent the same built-in type, even void, return true.
- If they are both pointer types, return SameType(t.targetType, u.targetType).
- If they are both array types, return SameType(t.elementType, u.elementType) && t.length === u.length.
- If they are both struct types, return t === u.
- Otherwise return false.
(SameType(int, int32_t) is false. Rationale: As it stands, SameType behaves the same on all platforms. By making types match if they are typedef'd on the current platform, we could make e.g. ctypes.int.ptr and ctypes.int32_t.ptr compatible on platforms where we just have typedef int int32_t. But it was unclear how much that would matter in practice, balanced against cross-platform consistency. We might reverse this decision.)
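The SameType rules translate almost directly into a recursive function. The sketch below uses a hypothetical plain-object model of types (the kind, name, targetType, elementType and length properties on these objects, and the constructor helpers, are assumptions made for illustration); it is not how js-ctypes represents CTypes internally.

```javascript
// Hypothetical plain-object model of CTypes, for illustration only.
const builtin = (name) => ({ kind: "builtin", name });
const pointerTo = (targetType) => ({ kind: "pointer", targetType });
const arrayOf = (elementType, length) => ({ kind: "array", elementType, length });
const struct = (name) => ({ kind: "struct", name });

function sameType(t, u) {
  if (t.kind !== u.kind) return false;
  switch (t.kind) {
    case "builtin":
      // Built-in types match by name only, never by platform typedef.
      return t.name === u.name;
    case "pointer":
      return sameType(t.targetType, u.targetType);
    case "array":
      return sameType(t.elementType, u.elementType) && t.length === u.length;
    case "struct":
      // Struct types match only by identity, not by name or layout.
      return t === u;
    default:
      return false;
  }
}

const int = builtin("int");
const int32_t = builtin("int32_t");
const s1 = struct("s_t");
const s2 = struct("s_t"); // same name, but a distinct struct type
```

In this model sameType(int, int32_t) is false, and so is sameType(s1, s2), while sameType(pointerTo(s1), pointerTo(s1)) is true.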
Examples
Cu.import("ctypes"); // imports the global ctypes object

// searches the path and opens "libmylib.so" on linux,
// "libmylib.dylib" on mac, and "mylib.dll" on windows
let mylib = ctypes.open("mylib", ctypes.SEARCH);

// declares the C function:
//   int32_t myfunc(int32_t);
let myfunc = mylib.declare("myfunc", ctypes.default_abi,
    ctypes.int32_t, ctypes.int32_t);

let ret = myfunc(2); // calls myfunc
Note that for simple types (integers and characters), we will autoconvert the argument at call time; there's no need to pass in a ctypes.int32_t object. The consumer should never need to instantiate such an object explicitly, unless they're using it to back a pointer, in which case we require explicit, strong typing. See later for examples.
Here is how to create an object of type int32_t:
let i = new ctypes.int32_t; // new int32_t object with default value 0
This allocates a new C++ object of type int32_t (4 bytes of memory), zeroes it out, and returns a JS object that manages the allocated memory. Whenever the JS object is garbage-collected, the allocated memory will be automatically freed.
Of course you don't normally need to do this, as js-ctypes will autoconvert JS numbers to various C/C++ types for you:
let myfunc = mylib.declare("myfunc", ctypes.default_abi,
    ctypes.int32_t, ctypes.int32_t);
let ret = myfunc(i);
print(typeof ret); // The result is a JavaScript number.
number
ctypes.int32_t is a CType. Like all other CTypes, it can be used for type specification when passed as an object, as above. (This will work for user-defined CTypes such as structs and pointers also; see later.)
The object created by new ctypes.int32_t is called a CData object. CData objects are described in detail in the "CData objects" section above.
Opaque pointers:
// A new opaque pointer type.
const FILE_ptr = new ctypes.StructType("FILE").ptr;

let fopen = mylib.declare("fopen", ctypes.default_abi,
    FILE_ptr, ctypes.char.ptr, ctypes.char.ptr);
let file = fopen("foo", "r");
if (file.isNull())
    throw "fopen failed";
file.contents(); // TypeError: type is unknown
(Open issue: fopen("foo", "r") does not work under js-ctypes as currently specified.)
Declaring a struct:
// C prototype: struct s_t { int32_t a; int64_t b; };
const s_t = new ctypes.StructType("s_t",
    [{ a: ctypes.int32_t }, { b: ctypes.int64_t }]);
let myfunc = mylib.declare("myfunc", ctypes.default_abi, ctypes.int32_t, s_t);

let s = new s_t(10, 20);
This creates an s_t object, which allocates enough memory for the whole struct, creates getters and setters to access the binary fields via their offset, and assigns the values 10 and 20 to the fields. The new object's prototype is s_t.prototype.
let i = myfunc(s); // checks the type of s
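The idea of a struct object that owns a block of memory and exposes fields through offset-based getters and setters can be pictured with a plain-JavaScript model. This is a sketch only: the class S is hypothetical, it assumes the common layout in which the int64_t member is 8-byte aligned (a at offset 0, b at offset 8, 16 bytes in total), it assumes little-endian storage, and it surfaces the 64-bit field as a BigInt rather than as a ctypes.Int64 object.

```javascript
// Hypothetical model of struct s_t { int32_t a; int64_t b; } over raw bytes.
class S {
  constructor(a = 0, b = 0n) {
    // One zero-filled buffer holds the whole struct, padding included.
    this.view = new DataView(new ArrayBuffer(16));
    this.a = a; // goes through the setter below
    this.b = b;
  }
  // Each accessor reads or writes the field at its fixed byte offset.
  get a() { return this.view.getInt32(0, true); }
  set a(v) { this.view.setInt32(0, v, true); }
  get b() { return this.view.getBigInt64(8, true); }
  set b(v) { this.view.setBigInt64(8, BigInt(v), true); }
}

const s = new S(10, 20);
let copy = s.a; // the getter returns a primitive copy of the field
copy = 5;       // changing the copy does not touch the struct's memory
```

After this runs, s.a is still 10 and s.b is 20n; only an assignment such as s.a = 5 reaches the underlying buffer.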
Nested structs:
const u_t = ctypes.StructType("u_t",
    [{ x: ctypes.int64_t }, { y: s_t }]);
let u = new u_t(5e4, s); // copies data from s into u.y - no references

let u_field = u.y; // creates an s_t object that points directly to
                   // the offset of u.y within u.
An out parameter:
// allocate sizeof(uint32_t)==4 bytes,
// initialize to 5, and return a new CData object
let i = new ctypes.uint32_t(5);

// Declare a C function with an out parameter.
const getint = mylib.declare("getint", ctypes.default_abi,
    ctypes.void_t, ctypes.uint32_t.ptr);

getint(i.address()); // explicitly take the address of allocated buffer
(Python ctypes has byref(i) as an alternative to i.address(), but we do not expect users to do the equivalent of from ctypes import *, and getint(ctypes.byref(i)) is a bit much.)
Pointers:
// Declare a C function that returns a pointer.
const getintp = mylib.declare("getintp", ctypes.default_abi,
    ctypes.uint32_t.ptr);

let p = getintp(); // A CData object that holds the returned uint32_t *

// cast from (uint32_t *) to (uint8_t *)
let q = ctypes.cast(p, ctypes.uint8_t.ptr);

// first byte of buffer
let b0 = q.contents(); // an integer, 0 <= b0 < 256
Struct fields:
const u_t = new ctypes.StructType('u_t',
    [[ctypes.uint32_t, 'x'], [ctypes.uint32_t, 'y']]);

// allocates 2 * sizeof(uint32_t) == 8 bytes and creates a CData object
let u = new u_t(5, 10);

u.x = 7; // setter for u.x modifies field

let i = u.y; // getter for u.y returns ConvertToJS(reference to u.y)
print(i);    // ...which is the primitive number 10
10

i = 5; // doesn't touch u.y
print(u.y);
10

const v_t = new ctypes.StructType('v_t',
    [[u_t, 'u'], [ctypes.uint32_t, 'z']]);

// allocates 12 bytes, zeroes them out, and creates a CData object
let v = new v_t;

let w = v.u; // ConvertToJS(reference to v.u) returns CData object
w.x = 3;     // invokes setter

setint(v.u.x); // TypeError: setint argument 1 expects type uint32_t *, got int

let p = v.u.addressOfField('x'); // pointer to v.u.x
setint(p); // ok - manually pass address
64-bit integers:
// Declare a function that returns a 64-bit unsigned int.
const getfilesize = mylib.declare("getfilesize", ctypes.default_abi,
    ctypes.uint64_t, ctypes.char.ptr);

// This autoconverts to a UInt64 object, not a JS number, even though the
// file is presumably much smaller than 4GiB. Converting to a different type
// each time you call the function, depending on the result value, would be
// worse.
let s = getfilesize("/usr/share/dict/words");
print(s instanceof ctypes.UInt64);
true
print(s < 1000000);   // Because s is an object, not a number,
false                 // JS lies to you.
print(s >= 1000000);  // Neither of these is doing what you want,
false                 // as evidenced by the bizarre answers.
print(s);             // It has a nice .toString() method at least!
931467

// There is no shortcut. To get an actual JS number out of a
// 64-bit integer, you have to use the ctypes.{Int64,UInt64}.{hi,lo}
// functions.
print(ctypes.UInt64.lo(s))
931467

// (OK, I lied. There is a shortcut. You can abuse the .toString() method.
// WARNING: This can lose precision!)
print(Number(s.toString()))
931467

let i = new ctypes.int64_t(5); // a new 8-byte buffer
let j = i; // another variable referring to the same CData object
j.value = 6; // invokes setter on i, auto-promotes 6 to Int64
print(typeof j.value) // but j.value is still an Int64 object
object
print(j.value instanceof ctypes.Int64)
true
print(j.value);
6

const m_t = new ctypes.StructType(
    'm_t', [[ctypes.int64_t, 'x'], [ctypes.int64_t, 'y']]);
let m = new m_t;
const getint64 = mylib.declare("getint64", ctypes.default_abi,
    ctypes.void_t, ctypes.int64_t.ptr);
getint64(m.x); // TypeError: getint64 argument 1 expected type int64_t *,
               // got Int64 object
               // (because m.x's getter autoconverts to an Int64 object)
getint64(m.addressOfField('x')); // works
(Open issue: As above, the implicit conversion from JS string to char * in getfilesize("/usr/share/dict/words") does not work in js-ctypes as specified.)
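When both 32-bit halves of a 64-bit value are needed, they have to be recombined by hand. The helper below is a hypothetical sketch of that step in plain JavaScript: uint64PartsToNumber is not part of the proposal, and it takes the two halves as ordinary numbers (in the proposal they would come from ctypes.UInt64.hi(s) and ctypes.UInt64.lo(s)). It refuses values that a JS number cannot hold exactly.

```javascript
// Hypothetical helper: combine the unsigned high and low 32-bit halves of
// a 64-bit value into a JS number, rejecting values that would lose
// precision.
function uint64PartsToNumber(hi, lo) {
  // With hi <= 0x1fffff the result is at most 2^53 - 1, which is the
  // largest integer a JS number represents exactly.
  if (hi > 0x1fffff) {
    throw new RangeError("value does not fit exactly in a JS number");
  }
  return hi * 2 ** 32 + lo;
}
```

For the file size in the example above, the high half would be 0, so uint64PartsToNumber(0, 931467) is simply 931467.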
(TODO - make this a real example:)
let i1 = ctypes.int32_t(5);
let i2 = ctypes.int32_t();
i2.value = i1; // i2 and i1 have separate binary storage, this is memcpy
// you can copy the guts of one struct to another, etc.
Future directions
Callbacks
The libffi part of this is presumably not too bad. Issues:
Lifetimes. C/C++ makes it impossible to track an object pointer. Both JavaScript's GC and experience with C/C++ function pointers will tend to discourage users from caring about function lifetimes.
I think the best solution to this problem is to put the burden of keeping the function alive entirely on the client.
Finding the right context to use. If we burn the cx right into the libffi closure, it will crash when called from a different thread or after the cx is destroyed. If we take a context at random from some internal JSAPI structure, it might be thread-safe, but the context's options and global will be random, which sounds dangerous. Perhaps ctypes itself can create a context per thread, on demand, for the use of function pointers. In a typical application, that would only create one context, if any.
Converting strings
I think we want an explicit API for converting strings, very roughly:
CData objects of certain pointer and array types have methods for reading and writing Unicode strings. These methods are present if the target or element type is an 8-bit character or integer type.
cdata.readString([encoding[, length]])
- Read bytes from cdata and convert them to Unicode characters using the specified encoding, returning a string. Specifically:
- If cdata is an array, let p = a pointer to the first element. Otherwise cdata is a pointer; let p = the value of cdata.
- If encoding is undefined or omitted, the selected encoding is UTF-8. Otherwise, if encoding is a string naming a known character encoding, that encoding is selected. Otherwise throw a TypeError.
- If length is a size value, cdata is an array, and length > cdata.length, then throw a TypeError.
- Otherwise, if length is a size value, take exactly length bytes starting at p and convert them to Unicode characters according to the selected encoding. (Open issue: Error handling.) Return a JavaScript string containing the Unicode characters, represented in UTF-16. (The result may contain null characters.)
- Otherwise, if length is undefined or omitted, convert bytes starting at p to Unicode characters according to the selected encoding. Stop when the end of the array is reached (if cdata is an array) or when a null character (U+0000) is found. (Open issue: Error handling.) Return a JavaScript string containing the Unicode characters, represented in UTF-16. (If cdata is a pointer and there is no trailing null character, this can crash.)
- Otherwise throw a TypeError.
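The readString steps can be modelled for the array case using the standard TextDecoder. This is a sketch under assumptions, not the proposed implementation: the array is modelled as a Uint8Array passed as the first argument, the set of known encodings is whatever the host's TextDecoder supports, a null character is detected as a single zero byte (true for UTF-8 and single-byte encodings), and malformed input follows TextDecoder's default of substituting U+FFFD, since the proposal leaves error handling open.

```javascript
// Hypothetical model of cdata.readString for an array of 8-bit elements.
function readString(bytes, encoding, length) {
  let decoder;
  try {
    decoder = new TextDecoder(encoding === undefined ? "utf-8" : encoding);
  } catch (e) {
    // TextDecoder reports unknown labels with RangeError; the proposal
    // asks for TypeError.
    throw new TypeError("unknown encoding: " + encoding);
  }
  let end;
  if (length === undefined) {
    // Stop at the first zero byte, or at the end of the array.
    const nul = bytes.indexOf(0);
    end = nul === -1 ? bytes.length : nul;
  } else if (Number.isInteger(length) && length >= 0) {
    if (length > bytes.length) {
      throw new TypeError("length exceeds array length");
    }
    end = length; // take exactly `length` bytes, zero bytes included
  } else {
    throw new TypeError("length is not a size value");
  }
  return decoder.decode(bytes.subarray(0, end));
}
```

With the bytes 0x68, 0x69, 0x00, 0x21, calling readString(bytes) gives "hi", while readString(bytes, undefined, 4) gives a four-character string that contains the null character.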
cdata.writeString(s, [encoding[, length]])
- Determine the starting pointer p as above. If s is not a well-formed UTF-16 string, throw a TypeError. (Open issue: Error handling.) Otherwise convert s to bytes in the specified encoding (default: UTF-8) and write at most length - 1 bytes, or all the converted bytes if length is undefined or omitted, to memory starting at p. Write a converted null character after the data. Return the number of bytes of data written, not counting the terminating null character.
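A matching model of writeString, again for the array case and for the default UTF-8 encoding only (the standard TextEncoder supports nothing else). The function name and the Uint8Array stand-in are assumptions. Where the real method would write past the end of the buffer, this sketch throws a RangeError instead, because a typed array cannot be overrun.

```javascript
// Hypothetical model of cdata.writeString(s, undefined, length) for UTF-8.

// Matches any unpaired surrogate code unit.
const LONE_SURROGATE =
  /[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/;

function writeString(bytes, s, length) {
  if (LONE_SURROGATE.test(s)) {
    throw new TypeError("string is not well-formed UTF-16");
  }
  const encoded = new TextEncoder().encode(s);
  // Write at most length - 1 bytes, or everything if length is omitted.
  const count = length === undefined
    ? encoded.length
    : Math.min(encoded.length, length - 1);
  if (count < 0 || count + 1 > bytes.length) {
    // The real method would crash or corrupt memory here.
    throw new RangeError("not enough room for data and terminator");
  }
  bytes.set(encoded.subarray(0, count));
  bytes[count] = 0; // terminating null character
  return count;     // bytes of data written, excluding the terminator
}
```

Because the limit is counted in bytes, a truncated write can end in the middle of a multi-byte UTF-8 sequence, and the return value alone does not say whether truncation happened; this is the awkwardness the open issues on this method describe.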
(Open issue: cdata.writeString(...) is awkward for the case where you want an autosized ctypes.char.array() to hold the converted data. If cdata happens to be too small for the resulting string, and you don't supply length, you crash; and if you do supply length, you don't know whether conversion was halted because the target array was of insufficient length.)
(Open issue: As proposed, these are not suitable for working with encodings where a zero byte might not indicate the end of text. For example, a string encoded in UTF-16 will typically contain a lot of zero bytes. Unfortunately, in the case of readString, the underlying library demands the length up front.)
(Open issue: These methods offer no error handling options, which is pretty weak. Real-world code often wants to allow a few characters to be garbled rather than fail. For now we will likely be limited to whatever the underlying codec library, nsIScriptableUnicodeConverter, can do.)
(Open issue: 16-bit versions too, for UTF-16?)
isNull
If we do not convert NULL pointers to JS null (and I may have changed my mind about this) then we need:
cptr.isNull()
- Return true if cptr's value is a null pointer, false otherwise.
Auto-converting strings
There are several issues:
Lifetimes. This problem arises when autoconverting from JS to C/C++ only.
When passing a string to a foreign function, like foo(s), what is the lifetime of the autoconverted pointer? We're comfortable with guaranteeing that the converted data for s stays valid for the duration of the call. But then there are situations like
const TenStrings = ctypes.char.ptr.array(10);
var arr = new TenStrings();
arr[0] = s; // What is the lifetime of the data arr[0] points to?
The more implicit conversion we allow, the greater a problem this is; it's a tough trade-off.
Non-null-terminated strings. This problem arises when autoconverting from C/C++ to JS only. It applies to C/C++ character arrays as well as pointers (but it's worse when dealing with pointers).
In C/C++, the type char * effectively promises nothing about the pointed-to data. Autoconverting would make it hard to use APIs that return non-null-terminated strings (or structs containing char * pointers that aren't logically strings). The workaround would be to declare them as a different type.
Unicode. This problem does not apply to conversions between JS strings and char16_t arrays or pointers; it applies only to char arrays or pointers.
Converting both ways raises issues about what encoding should be assumed. We assume JS strings are UTF-16 and char strings are UTF-8, which is not the right thing on Windows. However, Windows offers a lot of APIs that accept 16-bit strings and, for those, char16_t is the right thing.
Casting away const. This problem arises only when converting from a JS string to a C/C++ pointer type. The string data must not be modified, but the C/C++ types char * and char16_t * suggest that the referent might be modified.