writing mysql udfs
TRANSCRIPT
Writing MySQL User-Defined Functions
Welcome!
@rolandbouman
http://rpbouman.blogspot.com/
Ex-MySQL AB, Ex-Sun Microsystems
Currently at http://www.pentaho.com/
Different kinds of MySQL Functions
SQL Stored functionsSQL/PSM code stored in the database
User-defined functions (UDF's)Native code stored in an external binary library
Built-in functions (and operators)Native code part of the server binaries
CREATE FUNCTION lightspeed()RETURNS INT UNSIGNEDBEGIN RETURN 299792458;END;
CREATE FUNCTION lightspeedRETURNS INTEGER SONAME 'lightspeed.so';
MySQL Function Kinds Comparison
Built-inUDFStored routine
languageC/C++C/C++*SQL/PSM
executionnatively compilednatively compiledinterpreted
performancevery goodgoodnot that good
user-definednoyesyes
installationNACREATE stmtCREATE stmt
deploymentNAbinary libraryschema object
namespaceglobalglobalschema
grant-ablenonoyes
SQL capabilitiesnono*yes
Data typesinternalinternal and C/C++SQL data types
Charset supportyes*noyes
argumentlistflexible*flexiblefixed number & type
aggregatesyes*yesno
Quick and dirty example (1/2)
Compile to a shared library:
Write C/C++ code:
char lightspeed_init(){ return 0; }long long lightspeed(){ return 299792458; }
shell$ gcc -shared -o lightspeed.so lightspeed.c
Move library to plugin dir:
mysql> SHOW VARIABLES LIKE 'plugin_dir';+---------------+------------------------------+| Variable_name | Value |+---------------+------------------------------+| plugin_dir | /home/mysql/mysql/lib/plugin |+---------------+------------------------------+
shell$ mv lightspeed.so /home/mysql/mysql/lib/plugin
Quick and dirty example (2/2)
Register the function in MySQL
Verify:
mysql> SELECT * FROM mysql.func;+------------+-----+---------------+----------+| name | ret | dl | type |+------------+-----+---------------+----------+| lightspeed | 2 | lightspeed.so | function |+------------+-----+---------------+----------+
mysql> CREATE FUNCTION lightspeed -> RETURNS INTEGER SONAME 'lightspeed.so';
Run:
mysql> SELECT lightspeed();+--------- ----+| lightspeed() |+--------------+| 299792458 |+--------------+
UDF implementation functions
General UDF Implementation functionsInit function
Row-level function
De-init function
xxx_init()xxx()xxx_deinit()row?
UDF implementation functions
Aggregate UDF Implementation functionsclear function
add function
xxx_init()xxx_clear()xxx_deinit()start group?
end group?xxx_add()
xxx()
The init function
Signature:
my_bool xxx_init( UDF_INIT *initid, UDF_ARGS *args, char *message);
Set default propertiesof the return value
of the arguments
Validate arguments
Allocate resources
Signal an errorReturn 1 to signal an error
Copy error message to *message
The de-init function
Signature:
void xxx_deinit(UDF_INIT *initid);
De-allocate any resources
The row-level function
Returns the value to the callerINTEGER type (SQL: bigint):
REAL type (SQL: double):
long long xxx( UDF_INIT *initid, UDF_ARGS *args, my_bool *is_null, my_bool *error );
double xxx( UDF_INIT *initid, UDF_ARGS *args, my_bool *is_null, my_bool *error );
The row-level function
Returns the value to the callerSTRING type (SQL: varchar, text):
DECIMAL type (SQL: decimal):Just like STRING type
char *xxx( UDF_INIT *initid, UDF_ARGS *args, char* result, unsigned long length, my_bool *is_null, my_bool *error );
Type systems
SQL data typeRETURNSItem_resultC type
Character strings (CHAR, VARCHAR)
Text (TEXT, LONGTEXT, MEDIUMTEXT, TINYTEXT)Binary strings (BINARY,
VARBINARY)
Blob (BLOB, LONGBLOB, MEDIUMBLOB, TINYBLOB)
Temporal (DATE, TIME, DATETIME, TIMESTAMP)
Structured (ENUM, SET)STRINGSTRING_RESULTchar*
Floating point numbers (FLOAT, DOUBLE)REALREAL_RESULTdouble
Ints (BIGINT, MEDIUMINT, SMALLINT, TINYINT)INTEGERINT_RESULTlong long
NANAROW_RESULTNA
Fixed point numbers (DECIMAL)DECIMALDECIMAL_RESULTchar *
Data Structures
#include "mysql.h"include dir beneath MySQL home dir
Actually defined in mysql_com.h
UDF_INIT *initid
UDF_ARGS *args
Data Structures: UDF_INIT
Passed to all UDF implementation functions
typedef struct st_udf_init { my_bool maybe_null; /* 1 if function can return NULL */ unsigned int decimals; /* for real functions */ unsigned long max_length; /* For string functions */ char *ptr; /* free pointer for function data */ my_bool const_item; /* 1 if always returns the same value*/ void *extension;} UDF_INIT;
Conveys mainly info about return value
char *ptr explicitly meant to hold state:Resource allocation
Can always be read
Can be written in xxx_init()
Writing to UDF_INIT (1/3)
1st attempt at writing a REAL function:
char planck_init(){ return 0; }double planck(){ return 6.62606957e-34; }
mysql> CREATE FUNCTION planck -> RETURNS REAL SONAME 'planck.so';
mysql> SELECT planck();+----------+| planck() |+----------+| 0. |+----------+
But is planck() really 0?
Writing to UDF_INIT (2/3)
2nd attempt at writing a REAL function:
#include "mysql.h"
my_bool planck_init( UDF_INIT *initid, UDF_ARGS *args, char *msg){ initid->decimals = NOT_FIXED_DEC; return 0;}double planck(){return 6.62606957e-34;}
mysql> SELECT planck();+----------------+| planck() |+----------------+| 6.62606957e-34 |+----------------+
Writing to UDF_INIT (3/3)
More return value properties:
#include "mysql.h"
my_bool planck_init( UDF_INIT *initid, UDF_ARGS *args, char *msg){ initid->decimals = NOT_FIXED_DEC; initid->maybe_null = 0; initid->const_item = 1; return 0;}double planck(){return 6.62606957e-34;}
Data Structures: UDF_ARGS
Passed to init and row-level function
typedef struct st_udf_args{ unsigned int arg_count; /* Number of arguments */ enum Item_result *arg_type; /* Pointer to item_results */ char **args; /* Pointer to value */ unsigned long *lengths; /* Length of string arguments */ char *maybe_null; /* 1 for all maybe_null args */ char **attributes; /* Pointer to attribute name */ unsigned long *attribute_lengths;/* Length of attribute */ void *extension;} UDF_ARGS;
Conveys information about function arguments
Can always be read
Can be written in xxx_init()
Using UDF_ARGS (1/2)
A hello world UDF (init function):
#include "string.h"#include "stdio.h"#include "mysql.h"
my_bool hello_world_init( UDF_INIT *initid, UDF_ARGS *args, char *msg){ if (args->arg_count != 1) { memcpy(msg, "Missing message argument.", 26); return 1; } if (args->arg_type[0] != STRING_RESULT) { args->arg_type = STRING_RESULT; } return 0;}
Using UDF_ARGS (2/2)
A hello world UDF (row-level function):
char *hello_world( UDF_INIT *initid, UDF_ARGS *args, char *result, unsigned long *length, my_bool *is_null, my_bool *error){ char *arg_val = args->args[0]; char *arg_name = args->attributes[0]; char first = arg_name[0]; my_bool is_lit = first == '\'' || first == '"'; my_bool is_same = !strcmp(arg_value, arg_name); char *greeting = is_literal||is_same?"Hello":arg_name; *length = args->lengths[0] + strlen(greeting) + 3; if (*length > 255) { *is_null = 1; return NULL; } sprintf(result, "%s %s!", greeting, arg_value); return result;}
The UDF Repository
http://www.mysqludf.org/
Some highlightslib_mysqludf_myxml
lib_mysqludf_sys
lib_mysqludf_stem
lib_mysqludf_preg
For UDF authors:http://www.mysqludf.org/lib_mysqludf_udf/
My Latest Baby
https://github.com/rpbouman/mysqlv8udfs
mysql> SELECT js('var y = parseInt(arguments[0].substr(0,4), 10), a = y % 19, b = Math.floor(y / 100), c = y % 100, d = Math.floor(b / 4), e = b % 4, f = Math.floor((b + 8) / 25), g = Math.floor((b - f + 1) / 3), h = (19 * a + b - d - g + 15) % 30, i = Math.floor(c / 4), k = c % 4, L = (32 + 2 * e + 2 * i - h - k) % 7, m = Math.floor((a + 11 * h + 22 * L) / 451), n = h + L - 7 * m + 114; y + "-" + (Math.floor(n/31)) + "-" + ((n%31)+1);'), NOW());
Recommended Reading
http://rpbouman.blogspot.nl/search?q=UDF