1 Erl_Interface User's Guide

1.1 Introduction

The Erl_Interface library contains functions that help you integrate programs written in C and Erlang. The functions in Erl_Interface support the following:

Manipulation of data represented as Erlang data types
Conversion of data between C and Erlang formats
Encoding and decoding of Erlang data types for transmission or storage
Communication between C nodes and Erlang processes
Backup and restore of C node state to and from Mnesia

Note

By default, the Erl_Interface libraries are only guaranteed to be compatible with other Erlang/OTP components from the same release as the libraries themselves. For information about how to communicate with Erlang/OTP components from earlier releases, see the functions ei:ei_set_compat_rel and erl_eterm:erl_set_compat_rel.

Scope

In the following sections, these topics are described:

Compiling your code for use with Erl_Interface
Initializing Erl_Interface
Encoding, decoding, and sending Erlang terms
Building terms and patterns
Pattern matching
Connecting to a distributed Erlang node
Using the Erlang Port Mapper Daemon (EPMD)
Sending and receiving Erlang messages
Remote procedure calls
Using global names
Using the registry

Prerequisites

It is assumed that the reader is familiar with the Erlang programming language.

1.2 Compiling and Linking Your Code

To use any of the Erl_Interface functions, include the following lines in your code:

#include "erl_interface.h"
#include "ei.h"

Determine where the top directory of your OTP installation is. To find this, start Erlang and enter the following command at the Eshell prompt:

Eshell V4.7.4  (abort with ^G)
1> code:root_dir().
"/usr/local/otp"

To compile your code, ensure that your C compiler knows where to find erl_interface.h by specifying an appropriate -I argument on the command line, or add it to the CFLAGS definition in your Makefile. The correct value for this path is $OTPROOT/lib/erl_interface-$EIVSN/include, where:

$OTPROOT is the path reported by code:root_dir/0 in the example above.
$EIVSN is the version of the Erl_Interface application, for example, 3.2.3.

Compiling the code:

$ cc -c -I/usr/local/otp/lib/erl_interface-3.2.3/include myprog.c

When linking:

Specify the path to liberl_interface.a and libei.a with -L$OTPROOT/lib/erl_interface-3.2.3/lib.
Specify the name of the libraries with -lerl_interface -lei.

Do this on the command line or add the flags to the LDFLAGS definition in your Makefile.

Linking the code:

$ ld -L/usr/local/otp/lib/erl_interface-3.2.3/lib myprog.o -lerl_interface -lei -o myprog

On some systems you may need to link with additional libraries (for example, libnsl.a and libsocket.a on Solaris, or wsock32.lib on Windows) to use the communication facilities of Erl_Interface.

If you use the Erl_Interface functions in a threaded application based on POSIX threads or Solaris threads, then Erl_Interface needs access to some of the synchronization facilities in your threads package. You must specify extra compiler flags to indicate which of the packages you use. Define _REENTRANT and either STHREADS or PTHREADS. The default is to use POSIX threads if _REENTRANT is specified.

1.3 Initializing the Erl_Interface Library

Before calling any of the other Erl_Interface functions, call erl_init() exactly once to initialize the library. erl_init() takes two arguments, but Erl_Interface no longer uses them, so call it as erl_init(NULL, 0).

1.4 Encoding, Decoding, and Sending Erlang Terms

Data sent between distributed Erlang nodes is encoded in the Erlang external format. You must therefore encode and decode Erlang terms into byte streams if you want to use the distribution protocol to communicate between a C program and Erlang.

The Erl_Interface library supports this activity. It has several C functions that create and manipulate Erlang data structures. The library also contains an encode and a decode function. The following example shows how to create and encode an Erlang tuple {tobbe,3928}:

ETERM *arr[2], *tuple;
unsigned char buf[BUFSIZ];  /* erl_encode() writes into an unsigned char buffer */
int i;
  
arr[0] = erl_mk_atom("tobbe");
arr[1] = erl_mk_integer(3928);
tuple  = erl_mk_tuple(arr, 2);
i = erl_encode(tuple, buf);

Alternatively, you can use erl_send() and erl_receive_msg(), which handle the encoding and decoding of messages transparently.

For a complete description, see the following modules:

erl_eterm for creating Erlang terms
erl_marshal for encoding and decoding routines
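
To make the byte stream less abstract, the following sketch hand-encodes {tobbe,3928} using the tag values defined by the Erlang external term format (131 for the version byte, 104 for a small tuple, 100 for an atom, 98 for a 32-bit integer). The helper names put_atom, put_int32, and encode_pair are hypothetical and are not part of Erl_Interface. The output of erl_encode() for the same term is expected to be equivalent, but the exact tags it selects can differ between releases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Tag values taken from the Erlang external term format. */
enum {
    ETF_VERSION     = 131,  /* version byte that starts an encoded term */
    ETF_INTEGER     = 98,   /* INTEGER_EXT: 32-bit big-endian signed integer */
    ETF_ATOM        = 100,  /* ATOM_EXT: 2-byte length, then the atom text */
    ETF_SMALL_TUPLE = 104   /* SMALL_TUPLE_EXT: 1-byte arity, then elements */
};

/* Hypothetical helper: append an atom and return the new write position. */
size_t put_atom(unsigned char *buf, size_t pos, const char *name)
{
    size_t len = strlen(name);

    assert(len <= 255);                 /* atom text is limited to 255 characters */
    buf[pos++] = ETF_ATOM;
    buf[pos++] = (unsigned char)(len >> 8);
    buf[pos++] = (unsigned char)(len & 0xff);
    memcpy(buf + pos, name, len);
    return pos + len;
}

/* Hypothetical helper: append a 32-bit integer in big-endian byte order. */
size_t put_int32(unsigned char *buf, size_t pos, int32_t value)
{
    uint32_t v = (uint32_t)value;

    buf[pos++] = ETF_INTEGER;
    buf[pos++] = (unsigned char)(v >> 24);
    buf[pos++] = (unsigned char)(v >> 16);
    buf[pos++] = (unsigned char)(v >> 8);
    buf[pos++] = (unsigned char)v;
    return pos;
}

/* Encode the 2-tuple {Atom, N}. buf must hold at least strlen(atom) + 11
 * bytes. Returns the number of bytes written. */
size_t encode_pair(unsigned char *buf, const char *atom, int32_t n)
{
    size_t pos = 0;

    buf[pos++] = ETF_VERSION;
    buf[pos++] = ETF_SMALL_TUPLE;
    buf[pos++] = 2;                     /* arity of the tuple */
    pos = put_atom(buf, pos, atom);
    pos = put_int32(buf, pos, n);
    return pos;
}
```

Calling encode_pair(buf, "tobbe", 3928) fills buf with 16 bytes: the version byte, the tuple header, the atom, and the integer 3928 as the four bytes 0, 0, 15, 88.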

1.5 Building Terms and Patterns

The previous example can be simplified by using the erl_format module to create an Erlang term:

ETERM *ep;
ep = erl_format("{~a,~i}", "tobbe", 3928);

For a complete description of the different format directives, see the erl_format module.

The following example is more complex:

ETERM *ep;
ep = erl_format("[{name,~a},{age,~i},{data,~w}]",
                 "madonna", 
                 21, 
                 erl_format("[{adr,~s,~i}]", "E-street", 42));
erl_free_compound(ep);

As in the previous examples, it is your responsibility to free the memory allocated for Erlang terms. In this example, erl_free_compound() ensures that the complete term pointed to by ep is released. This is necessary because the pointer from the second call to erl_format is lost.

The following example shows a slightly different solution:

ETERM *ep,*ep2;
ep2 = erl_format("[{adr,~s,~i}]","E-street",42);
ep  = erl_format("[{name,~a},{age,~i},{data,~w}]",
                 "madonna", 21, ep2);
erl_free_term(ep);  
erl_free_term(ep2);

In this case, you free the two terms independently. The order in which you free the terms ep and ep2 is not important, because the Erl_Interface library uses reference counting to determine when it is safe to remove objects.
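
The following toy model illustrates why the order of the two calls does not matter under reference counting. It is only a sketch of the general technique, not the allocator used by Erl_Interface. The names toy_term, toy_new, toy_free, and toy_live_terms are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

int toy_live_terms = 0;       /* number of terms not yet returned to the system */

typedef struct toy_term {
    int refcount;             /* how many owners still reference this term */
    struct toy_term *child;   /* optional nested term, may be NULL */
} toy_term;

/* Create a term. The creator holds one reference. If a child is supplied,
 * the new parent takes its own additional reference to the child. */
toy_term *toy_new(toy_term *child)
{
    toy_term *t = malloc(sizeof *t);

    assert(t != NULL);
    t->refcount = 1;
    t->child = child;
    if (child != NULL)
        child->refcount++;
    toy_live_terms++;
    return t;
}

/* Drop one reference. The memory is released only when the last reference
 * goes away, at which point the reference to the child is dropped as well. */
void toy_free(toy_term *t)
{
    if (t == NULL)
        return;
    if (--t->refcount > 0)
        return;               /* someone else still uses this term */
    toy_free(t->child);
    free(t);
    toy_live_terms--;
}
```

With ep2 created by toy_new(NULL) and ep created by toy_new(ep2), calling toy_free() on ep and ep2 in either order leaves toy_live_terms at 0. The nested term survives until both its creator and its parent have released it.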

If you are unsure whether you have freed the terms properly, you can use the following function to see the status of the fixed term allocator:

long allocated, freed;

erl_eterm_statistics(&allocated,&freed);
printf("currently allocated blocks: %ld\n",allocated);
printf("length of freelist: %ld\n",freed);

/* really free the freelist */
erl_eterm_release();

For more information, see the erl_malloc module.

1.6 Pattern Matching

An Erlang pattern is a term that can contain unbound variables or "do not care" symbols. Such a pattern can be matched against a term and, if the match is successful, any unbound variables in the pattern will be bound as a side effect. The content of a bound variable can then be retrieved:

ETERM *pattern;
pattern = erl_format("{madonna,Age,_}");

The erl_format:erl_match function performs pattern matching. It takes a pattern and a term and tries to match them. As a side effect any unbound variables in the pattern will be bound. In the following example, a pattern is created with a variable Age, which is included at two positions in the tuple. The pattern match is performed as follows:

erl_match binds the contents of Age to 21 the first time it reaches the variable.
The second occurrence of Age causes a test for equality between the terms, as Age is already bound to 21. As Age is bound to 21, the equality test succeeds and the match continues until the end of the pattern.
If the end of the pattern is reached, the match succeeds and you can retrieve the contents of the variable.

ETERM *pattern, *term, *ep;
pattern = erl_format("{madonna,Age,Age}");
term    = erl_format("{madonna,21,21}");
if (erl_match(pattern, term)) {
  fprintf(stderr, "Yes, they matched: Age = ");
  ep = erl_var_content(pattern, "Age"); 
  erl_print_term(stderr, ep);
  fprintf(stderr,"\n");
  erl_free_term(ep);
}
erl_free_term(pattern);
erl_free_term(term);

For more information, see the erl_format:erl_match function.
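
The bind-then-compare behavior described in this section can be sketched in plain C. The model below is deliberately simplified: terms are flat arrays of integers and a pattern element is either a literal, a named variable, or "_". All names (toy_pat, toy_bindings, toy_lookup, toy_match) are hypothetical, and the sketch does not reflect how erl_match is implemented.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_VARS 8

typedef struct {
    const char *var;   /* variable name, "_" for do-not-care, NULL for a literal */
    int literal;       /* used only when var == NULL */
} toy_pat;

typedef struct {
    const char *names[TOY_MAX_VARS];
    int values[TOY_MAX_VARS];
    int count;
} toy_bindings;

/* Returns 1 and stores the value in *out if the variable is bound. */
int toy_lookup(const toy_bindings *b, const char *name, int *out)
{
    for (int i = 0; i < b->count; i++) {
        if (strcmp(b->names[i], name) == 0) {
            *out = b->values[i];
            return 1;
        }
    }
    return 0;
}

/* Match a pattern against a term, both of length n. Returns 1 on success.
 * Unbound variables are bound as a side effect. */
int toy_match(const toy_pat *pat, const int *term, int n, toy_bindings *b)
{
    for (int i = 0; i < n; i++) {
        int bound;

        if (pat[i].var == NULL) {
            if (pat[i].literal != term[i])
                return 0;                     /* literal mismatch */
        } else if (strcmp(pat[i].var, "_") == 0) {
            continue;                         /* matches anything, binds nothing */
        } else if (toy_lookup(b, pat[i].var, &bound)) {
            if (bound != term[i])
                return 0;                     /* already bound: test equality */
        } else {
            assert(b->count < TOY_MAX_VARS);
            b->names[b->count] = pat[i].var;  /* first occurrence: bind */
            b->values[b->count] = term[i];
            b->count++;
        }
    }
    return 1;
}
```

A pattern with Age in two positions matches a term carrying 21 in both, after which toy_lookup() returns 21 for Age. The same pattern fails against a term carrying 21 and 22. In this sketch a failed match can leave earlier bindings in place, so the bindings are meaningful only after a successful match.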

1.7 Connecting to a Distributed Erlang Node

To connect to a distributed Erlang node, you must first initialize the connection routine with erl_connect:erl_connect_init, which stores information, such as the hostname, node name, and IP address for later use:

int identification_number = 99;
int creation=1;
char *cookie="a secret cookie string"; /* An example */
erl_connect_init(identification_number, cookie, creation);

For more information, see the erl_connect module.

After initialization, you set up the connection to the Erlang node. To specify the Erlang node you want to connect to, use erl_connect(). The following example sets up the connection and, on success, yields a valid socket file descriptor:

int sockfd;
char *nodename="xyz@chivas.du.etx.ericsson.se"; /* An example */
if ((sockfd = erl_connect(nodename)) < 0)
  erl_err_quit("ERROR: erl_connect failed");

erl_err_quit() prints the specified string and terminates the program. For more information, see the erl_error module.

1.8 Using EPMD

erts:epmd is the Erlang Port Mapper Daemon. Distributed Erlang nodes register with epmd on the local host to indicate to other nodes that they exist and can accept connections. epmd maintains a register of node and port number information, and when a node wishes to connect to another node, it first contacts epmd to find the correct port number to connect to.

When you use erl_connect to connect to an Erlang node, a connection is first made to epmd and, if the node is known, a connection is then made to the Erlang node.

C nodes can also register themselves with epmd if they want other nodes in the system to be able to find and connect to them.

Before registering with epmd, you must first create a listen socket and bind it to a port. Then:

int pub;

pub = erl_publish(port);

pub is a file descriptor now connected to epmd. epmd monitors the other end of the connection. If it detects that the connection has been closed, the node becomes unregistered. So, if you explicitly close the descriptor or if your node fails, it becomes unregistered from epmd.
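
The listen socket required before registration can be created with the ordinary POSIX socket calls. The following is a minimal sketch under that assumption. The function name open_listen_socket is hypothetical, and letting the operating system choose the port is just one possible policy.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Create a TCP listen socket on a port chosen by the operating system.
 * Returns the descriptor and stores the port number in *port_out,
 * or returns -1 on failure. */
int open_listen_socket(int *port_out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int on = 1;
    int fd;

    assert(port_out != NULL);
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on);

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(0);             /* 0 asks the OS for a free port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 5) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port_out = ntohs(addr.sin_port);     /* the port that was actually bound */
    return fd;
}
```

The port stored in *port_out is the value to pass to erl_publish(port). The returned descriptor is the listen socket on which incoming connections are later accepted.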

Notice that on some systems (such as VxWorks), a failed node is not detected by this mechanism, as the operating system does not automatically close descriptors that were left open when the node failed. If a node has failed in this way, epmd prevents you from registering a new node with the old name, as it thinks that the old name is still in use. In this case, you must unregister the name explicitly:

erl_unpublish(node);

This causes epmd to close the connection from the far end. Notice that if the name was in fact still in use by a node, the results of this operation are unpredictable. Also, doing this does not cause the local end of the connection to close, so resources can be consumed.

1.9 Sending and Receiving Erlang Messages

Use one of the following two functions to send messages:

erl_connect:erl_send
erl_connect:erl_reg_send

As in Erlang, messages can be sent to a pid or to a registered name. It is easier to send a message to a registered name, as it avoids the problem of finding a suitable pid.

Use one of the following two functions to receive messages:

erl_connect:erl_receive
erl_connect:erl_receive_msg

erl_receive() receives the message into a buffer, while erl_receive_msg() decodes the message into an Erlang term.

Example of Sending Messages

In the following example, {Pid, hello_world} is sent to a registered process my_server. The message is encoded by erl_reg_send():

extern const char *erl_thisnodename(void);
extern short erl_thiscreation(void);
#define SELF(fd) erl_mk_pid(erl_thisnodename(),fd,0,erl_thiscreation())
ETERM *arr[2], *emsg;
int sockfd, creation=1;
  
arr[0] = SELF(sockfd);
arr[1] = erl_mk_atom("hello_world");
emsg   = erl_mk_tuple(arr, 2);
  
erl_reg_send(sockfd, "my_server", emsg);
erl_free_term(emsg);

The first element of the tuple that is sent is your own pid. This enables my_server to reply. For more information about the primitives, see the erl_connect module.

Example of Receiving Messages

In this example, {Pid, Something} is received. The received pid is then used to return {goodbye,Pid}.

ETERM *arr[2], *answer;
int sockfd,rc;
unsigned char buf[BUFSIZ];
ErlMessage emsg;
  
if ((rc = erl_receive_msg(sockfd, buf, BUFSIZ, &emsg)) == ERL_MSG) {
   arr[0] = erl_mk_atom("goodbye");
   arr[1] = erl_element(1, emsg.msg); 
   answer = erl_mk_tuple(arr, 2);
   erl_send(sockfd, arr[1], answer);
   erl_free_term(answer);
   erl_free_term(emsg.msg);
   erl_free_term(emsg.to);
}

To provide robustness, a distributed Erlang node occasionally polls all its connected neighbors in an attempt to detect failed nodes or communication links. A node that receives such a message is expected to respond immediately with an ERL_TICK message. This is done automatically by erl_receive(). However, when this has occurred, erl_receive() returns ERL_TICK to the caller without storing a message into the ErlMessage structure.

When a message has been received, it is the caller's responsibility to free the received message emsg.msg and emsg.to or emsg.from, depending on the type of message received.

For more information, see the erl_connect and erl_eterm modules.

1.10 Remote Procedure Calls

An Erlang node acting as a client to another Erlang node typically sends a request and waits for a reply. Such a request is included in a function call at a remote node and is called a remote procedure call.

The following example shows how the Erl_Interface library supports remote procedure calls:

char modname[]=THE_MODNAME;
ETERM *reply,*ep;
ep = erl_format("[~a,[]]", modname);
if (!(reply = erl_rpc(fd, "c", "c", ep)))
  erl_err_msg("<ERROR> when compiling file: %s.erl !\n", modname);
erl_free_term(ep);
ep = erl_format("{ok,_}");
if (!erl_match(ep, reply))
  erl_err_msg("<ERROR> compiler errors !\n");
erl_free_term(ep);
erl_free_term(reply);

c:c/1 is called to compile the specified module on the remote node. erl_match() checks that the compilation was successful by testing for the expected ok.

For more information about erl_rpc() and its companions erl_rpc_to() and erl_rpc_from(), see the erl_connect module.

1.11 Using Global Names

A C node has access to names registered through the global module in Kernel. Names can be looked up, allowing the C node to send messages to named Erlang services. C nodes can also register global names, allowing them to provide named services to Erlang processes or other C nodes.

Erl_Interface does not provide a native implementation of the global service. Instead it uses the global services provided by a "nearby" Erlang node. To use the services described in this section, it is necessary to first open a connection to an Erlang node.

To see what names there are:

char **names;
int count;
int i;

names = erl_global_names(fd,&count);

if (names) 
  for (i=0; i<count; i++) 
    printf("%s\n",names[i]);

free(names);

erl_global:erl_global_names allocates and returns a buffer containing all the names known to the global module in Kernel. count is initialized to indicate the number of names in the array. The array of strings in names is terminated by a NULL pointer, so it is not necessary to use count to determine when the last name is reached.

It is the caller's responsibility to free the array. erl_global_names allocates the array and all the strings using a single call to malloc(), so free(names) is all that is necessary.
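
To see why a single free() is enough, consider the following sketch of one way to lay out a pointer array and its strings in a single allocation. It illustrates the layout idea only and is not the code used by erl_global_names. The function name toy_pack_names is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy count strings into ONE malloc'ed block laid out as
 *   [ptr 0][ptr 1]...[NULL][text 0\0][text 1\0]...
 * so that a single free() on the returned pointer releases everything. */
char **toy_pack_names(const char **src, int count)
{
    size_t bytes;
    char **names;
    char *text;

    assert(count >= 0);
    bytes = ((size_t)count + 1) * sizeof(char *);   /* pointers plus NULL */
    for (int i = 0; i < count; i++)
        bytes += strlen(src[i]) + 1;                /* text plus terminator */

    names = malloc(bytes);
    if (names == NULL)
        return NULL;

    text = (char *)(names + count + 1);             /* text area follows the pointers */
    for (int i = 0; i < count; i++) {
        size_t n = strlen(src[i]) + 1;

        memcpy(text, src[i], n);
        names[i] = text;
        text += n;
    }
    names[count] = NULL;                            /* terminator for the array */
    return names;
}
```

Because the array ends with a NULL pointer, a caller can walk it with a loop that stops at the first NULL entry and then release the whole block with free(names).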

To look up one of the names:

ETERM *pid;
char node[256];

pid = erl_global_whereis(fd,"schedule",node);

If "schedule" is known to the global module in Kernel, an Erlang pid is returned that can be used to send messages to the schedule service. Also, node is initialized to contain the name of the node where the service is registered, so that you can make a connection to it by simply passing the variable to erl_connect.

Before registering a name, you should already have registered your port number with epmd. This is not strictly necessary, but if you neglect to do so, then other nodes wishing to communicate with your service cannot find or connect to your process.

Create a pid that Erlang processes can use to communicate with your service:

ETERM *pid;

pid = erl_mk_pid(thisnode,14,0,0);
erl_global_register(fd,servicename,pid);

After registering the name, use erl_connect:erl_accept to wait for incoming connections.

Note

Remember to free pid later with erl_malloc:erl_free_term.

To unregister a name:

erl_global_unregister(fd,servicename);

1.12 Using the Registry

This section describes the use of the registry, a simple mechanism for storing key-value pairs in a C-node, as well as backing them up or restoring them from an Mnesia table on an Erlang node. For more detailed information about the individual API functions, see the registry module.

Keys are strings, that is, NULL-terminated arrays of characters, and values are arbitrary objects. Although integers and floating point numbers are treated specially by the registry, you can store strings or binary objects of any type as pointers.

To start, open a registry:

ei_reg *reg;

reg = ei_reg_open(45);

The number 45 in the example indicates the approximate number of objects that you expect to store in the registry. Internally the registry uses hash tables with collision chaining, so there is no absolute upper limit on the number of objects that the registry can contain. If performance or memory usage is important, choose the number accordingly. The registry can be resized later.
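
The effect of the size hint is easier to see in a small model of a hash table with collision chaining. The sketch below stores only integer values and is not the registry implementation. All names (toy_reg, toy_entry, toy_open, toy_setival, toy_getival, toy_close) and the hash function are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct toy_entry {
    char *key;
    long value;
    struct toy_entry *next;    /* next entry chained in the same bucket */
} toy_entry;

typedef struct {
    toy_entry **buckets;
    size_t nbuckets;           /* fixed by the size hint given at open time */
} toy_reg;

size_t toy_hash(const char *key, size_t nbuckets)
{
    size_t h = 5381;

    while (*key != '\0')
        h = h * 33 + (unsigned char)*key++;
    return h % nbuckets;
}

toy_reg *toy_open(size_t size_hint)
{
    toy_reg *reg;

    assert(size_hint > 0);
    reg = malloc(sizeof *reg);
    if (reg == NULL)
        return NULL;
    reg->buckets = calloc(size_hint, sizeof *reg->buckets);
    if (reg->buckets == NULL) {
        free(reg);
        return NULL;
    }
    reg->nbuckets = size_hint;
    return reg;
}

/* Store or replace a value. Returns 0 on success, -1 if out of memory. */
int toy_setival(toy_reg *reg, const char *key, long value)
{
    size_t i = toy_hash(key, reg->nbuckets);
    toy_entry *e;

    for (e = reg->buckets[i]; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) {
            e->value = value;              /* existing key: replace */
            return 0;
        }
    }
    e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->key = malloc(strlen(key) + 1);
    if (e->key == NULL) {
        free(e);
        return -1;
    }
    strcpy(e->key, key);
    e->value = value;
    e->next = reg->buckets[i];             /* new key: push onto the chain */
    reg->buckets[i] = e;
    return 0;
}

/* Returns 1 and stores the value in *out if the key exists, otherwise 0. */
int toy_getival(const toy_reg *reg, const char *key, long *out)
{
    const toy_entry *e = reg->buckets[toy_hash(key, reg->nbuckets)];

    for (; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) {
            *out = e->value;
            return 1;
        }
    }
    return 0;
}

void toy_close(toy_reg *reg)
{
    for (size_t i = 0; i < reg->nbuckets; i++) {
        toy_entry *e = reg->buckets[i];

        while (e != NULL) {
            toy_entry *next = e->next;

            free(e->key);
            free(e);
            e = next;
        }
    }
    free(reg->buckets);
    free(reg);
}
```

A table opened with a hint of 2 still accepts any number of keys, because colliding keys are chained within a bucket. The cost is that lookups walk longer chains, which is why a hint close to the expected number of objects helps performance.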

You can open as many registries as you like (if memory permits).

Objects are stored and retrieved through set and get functions. The following example shows how to store integers, floats, strings, and arbitrary binary objects:

struct bonk *b = malloc(sizeof(*b));
char *name = malloc(7);

ei_reg_setival(reg,"age",29); 
ei_reg_setfval(reg,"height",1.85);

strcpy(name,"Martin");
ei_reg_setsval(reg,"name",name); 

b->l = 42;
b->m = 12;
ei_reg_setpval(reg,"jox",b,sizeof(*b));

If you try to store an object in the registry and there is an existing object with the same key, the new value replaces the old one. This is done regardless of whether the new object and the old one have the same type, so you can, for example, replace a string with an integer. If the existing value is a string or binary, it is freed before the new value is assigned.

Stored values are retrieved from the registry as follows:

long i;
double f;
char *s;
struct bonk *b;
int size;

i = ei_reg_getival(reg,"age");
f = ei_reg_getfval(reg,"height");
s = ei_reg_getsval(reg,"name");
b = ei_reg_getpval(reg,"jox",&size);

In all the above examples, the object must exist and it must be of the right type for the specified operation. If you do not know the type of an object, you can ask:

struct ei_reg_stat buf;

ei_reg_stat(reg,"name",&buf);

buf is initialized to contain the object attributes.

Objects can be removed from the registry:

ei_reg_delete(reg,"name");

When you are finished with a registry, close it to remove all the objects and free the memory back to the system:

ei_reg_close(reg);

Backing Up the Registry to Mnesia

The contents of a registry can be backed up to Mnesia on a "nearby" Erlang node. You must provide an open connection to the Erlang node (see erl_connect). Also, Mnesia 3.0 or later must be running on the Erlang node before the backup is initiated:

ei_reg_dump(fd, reg, "mtab", dumpflags);

This example backs up the contents of the registry to the specified Mnesia table "mtab". Once a registry has been backed up to Mnesia like this, later backups only affect objects that have been modified since the most recent backup, that is, objects that have been created, changed, or deleted. The backup operation is done as a single atomic transaction, so that either the entire backup is performed or none of it.

Likewise, a registry can be restored from an Mnesia table:

ei_reg_restore(fd, reg, "mtab");

This reads the entire contents of "mtab" into the specified registry. After the restore, all the objects in the registry are marked as unmodified, so a later backup only affects objects that you have modified since the restore.

Notice that if you restore to a non-empty registry, objects in the table overwrite objects in the registry with the same keys. Also, the entire contents of the registry are marked as unmodified after the restore, including any modified objects that were not overwritten by the restore operation. This may not be your intention.
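
The interaction between modification marks, backups, and restores can be modeled with a dirty flag per object. The following sketch uses two small in-memory stores in place of the registry and the Mnesia table, and it does not model deletions. It is an illustration of the behavior described above, not the registry code, and all names (toy_obj, toy_store, toy_slot, toy_set, toy_dump, toy_restore) are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define TOY_SLOTS 4

typedef struct {
    const char *key;   /* NULL marks an unused slot */
    long value;
    int dirty;         /* 1 if changed since the last backup or restore */
} toy_obj;

typedef struct {
    toy_obj slot[TOY_SLOTS];
} toy_store;

/* Find the slot for a key, claiming an unused slot if the key is new. */
toy_obj *toy_slot(toy_store *s, const char *key)
{
    toy_obj *unused = NULL;

    for (int i = 0; i < TOY_SLOTS; i++) {
        toy_obj *o = &s->slot[i];

        if (o->key == NULL) {
            if (unused == NULL)
                unused = o;
        } else if (strcmp(o->key, key) == 0) {
            return o;
        }
    }
    assert(unused != NULL);    /* the toy store has a fixed capacity */
    unused->key = key;
    return unused;
}

/* Store a value and mark the object as modified. */
void toy_set(toy_store *s, const char *key, long value)
{
    toy_obj *o = toy_slot(s, key);

    o->value = value;
    o->dirty = 1;
}

/* Back up only the modified objects. Returns how many were written. */
int toy_dump(toy_store *reg, toy_store *table)
{
    int written = 0;

    for (int i = 0; i < TOY_SLOTS; i++) {
        toy_obj *o = &reg->slot[i];

        if (o->key != NULL && o->dirty) {
            toy_slot(table, o->key)->value = o->value;
            o->dirty = 0;
            written++;
        }
    }
    return written;
}

/* Copy every table object into the registry, then mark ALL registry
 * objects as unmodified, including those the table did not overwrite. */
void toy_restore(toy_store *reg, toy_store *table)
{
    for (int i = 0; i < TOY_SLOTS; i++) {
        const toy_obj *t = &table->slot[i];

        if (t->key != NULL)
            toy_slot(reg, t->key)->value = t->value;
    }
    for (int i = 0; i < TOY_SLOTS; i++)
        reg->slot[i].dirty = 0;
}
```

In this model, an object that is set in the registry and then followed by toy_restore() before any toy_dump() is marked unmodified even though it never reached the table, so the next toy_dump() writes nothing for it. This is the situation the paragraph above warns about.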

Storing Strings and Binaries

When string or binary objects are stored in the registry it is important that some simple guidelines are followed.

Most importantly, the object must have been created with a single call to malloc() (or similar), so that it can later be removed by a single call to free(). Objects are freed by the registry when it is closed, or when you assign a new value to an object that previously contained a string or binary.

Notice that if you store binary objects that are context-dependent (for example, containing pointers or open file descriptors), they lose their meaning if they are backed up to an Mnesia table and later restored in a different context.

When you retrieve a stored string or binary value from the registry, the registry maintains a pointer to the object and you are passed a copy of that pointer. You should never free an object retrieved in this manner because when the registry later attempts to free it, a runtime error occurs that likely causes the C-node to crash.

You are free to modify the contents of an object retrieved this way. However, the registry is not aware of such changes, so the modified object may be skipped the next time you make an Mnesia backup of the registry contents. To avoid this, mark the object as dirty after any such change with registry:ei_reg_markdirty, or pass appropriate flags to registry:ei_reg_dump.