Writing a UDF

<< Click to Display Table of Contents >>

Navigation:  Integrating Blaze with Other Systems > Transforming Data in s-Server >

Writing a UDF

Previous pageReturn to chapter overviewNext page

SQLstream requires no streaming extension of SQL:2008's SQL/OLB standard for UDFs, also known as "external routines". Farrago supports this standard; please refer to the External Routines section at http://farrago.sourceforge.net/design/UserDefinedTypesAndRoutines.html.

Creating user-defined functions can be a multi-step process, especially when implemented externally via Java code.

To help explain things, let's examine the process of creating a few user-defined functions to help deal with Global Positioning System (GPS) coordinates for use in geolocation.

Creating UDFs in SQL

We'll begin by creating user-defined functions in SQL that help us manipulate measurements in degrees.

First, we create a new schema to handle our extensions and set the schema and path to point to it:

CREATE SCHEMA geolocate_sql;

SET SCHEMA 'geolocate_sql';

SET PATH 'geolocate_sql';

 

Next, we create a function that allows us to extract the fractional portion of a degree measurement.

CREATE FUNCTION get_fraction( degrees DOUBLE )

   RETURNS DOUBLE

   CONTAINS SQL

   RETURN degrees - FLOOR(degrees)

;

 

Next, we use get_fraction to build a UDF that extracts minutes from a degree measurement:

CREATE FUNCTION get_minutes( degrees DOUBLE )

   RETURNS INTEGER

   CONTAINS SQL

   RETURN CAST( (get_fraction(degrees) * 60.0) as INTEGER )

;

 

With both of these functions in hand, we can build yet another UDF to extract seconds:

CREATE FUNCTION get_seconds( degrees DOUBLE )

   RETURNS INTEGER

   CONTAINS SQL

   RETURN CAST( ((get_fraction(degrees) * 3600.0) - ((get_minutes(degrees) * 60)) )

                 as INTEGER )

;

 

Now that we have our extraction functions in place, let's create one more function that brings them all together to convert from degrees in decimal format to a fully specified string indicating degrees, minutes, and seconds:

CREATE FUNCTION to_dms( degrees DOUBLE )

   RETURNS VARCHAR(20)

   CONTAINS SQL

   RETURN

       ( CAST( CAST(degrees AS INTEGER) AS VARCHAR(20) ) ||

       'd ' ||

       CAST( get_minutes(degrees) AS VARCHAR(20) ) ||

       'm ' ||

       CAST( get_seconds(degrees) AS VARCHAR(20) ) ||

       's' )

;

 

Creating UDFs in Java

In the previous section, we created four user-defined functions using SQL. Though SQL is perfectly sufficient for these functions, each of them could also have been created using Java. For the sake of comparison, we'll do that now.

First, we create a new schema to handle our extensions and set the schema and path to point to it:

CREATE SCHEMA geolocate_java;

SET SCHEMA 'geolocate_java';

SET PATH 'geolocate_java';

 

Second, we need a Java class that implements the conversions. To do so, we create a Degree.java file containing the following code under a com/sqlstream/examples/geolocate directory:

package com.sqlstream.examples.geolocate;

public class Degree {

   public static double getFraction( double degrees ) {

       return( degrees - Math.floor(degrees) );

   }

   public static int getMinutes( double degrees ) {

       return( (int)(getFraction(degrees) * 60.0) );

   }

   public static int getSeconds( double degrees ) {

       return (int)(((getFraction(degrees) * 3600.0)

                      - ((getMinutes(degrees) * 60)) ) );

   }

   public static String toDegMinSec( double degrees) {

int degs = (int)degrees;

 double mins = getMinutes(degrees);

double secs = getSeconds(degrees);

return(degs + "deg " + mins + "m " + secs + "s");

   }

}

 

Third, after compiling Degree.java, we need to package the Java class up into a JAR file so that SQLstream s-Server can load it. Create the jar from the directory containing com/sqlstream/examples/geolocate

jar cf gps.jar com

 

Fourth, we move the newly create gps.jar file to the machine SQLstream s-Server is running on (in the example below, we assume it is in /home/aspen/gps.jar), and use CREATE JAR to install the JAR into SQLstream s-Server:

CREATE JAR "GPSFuncsJavaLib"

   LIBRARY 'file:/home/aspen/gps.jar'

   OPTIONS(0);

--- This code loads the installed Java JAR file

       

At this point, we are ready to go ahead and use CREATE FUNCTION to make these routines available from SQL ("EXTERNAL NAME" is explained in CREATE FUNCTION):

CREATE FUNCTION get_fraction( degrees DOUBLE )

   RETURNS DOUBLE

   LANGUAGE JAVA

   NO SQL

   EXTERNAL NAME 'GPSFuncsJavaLib:com.sqlstream.examples.geolocate.Degree.getFraction'

;

CREATE FUNCTION get_minutes( degrees DOUBLE )

   RETURNS INTEGER

   LANGUAGE JAVA

   NO SQL

   EXTERNAL NAME 'GPSFuncsJavaLib:com.sqlstream.examples.geolocate.Degree.getMinutes'

;

CREATE FUNCTION get_seconds( degrees DOUBLE )

   RETURNS INTEGER

   LANGUAGE JAVA

   NO SQL

   EXTERNAL NAME 'GPSFuncsJavaLib:com.sqlstream.examples.geolocate.Degree.getSeconds'

;

CREATE FUNCTION to_dms( degrees DOUBLE )

   RETURNS VARCHAR(20)

   LANGUAGE JAVA

   NO SQL

   EXTERNAL NAME 'GPSFuncsJavaLib:com.sqlstream.examples.geolocate.Degree.toDegMinSec'

;

 

So far, all we've done is duplicate user defined functions in Java that we could have created in SQL. Java functions, however, allow access to a much richer set of capabilities.

As an example, let's create a user-defined function that calculates the great-circle distance in miles between two geographic locations, each of which is measured in degrees latitude and longitude. We'll do this using a simple form of the great-circle distance equation, calculating the central angle using the spherical law of cosines, and multiplying by the earth's radius in miles. This requires the use of cosine, sine, and arcosine function which do not exist in SQL, but do exist in the java.lang.Math package.

Create a Waypoint.java in the same directory as Degree.java containing the following code:

package com.sqlstream.examples.geolocate;

import java.lang.Math;

public class Waypoint {

   public static double getDistanceMiles( double lat1,

                                  double long1,

                                  double lat2,

                                  double long2 ) {

       -- Use the Great Circle Distance Formula

       -- to calculate distance in miles

double rlat1 = Math.sin(Math.toRadians(lat1));

double rlat2 = Math.sin(Math.toRadians(lat2));

double rlong1 = Math.sin(Math.toRadians(long1));

double rlong2 = Math.sin(Math.toRadians(long2));

double cangle = Math.acos( Math.sin(rlat1) * Math.sin(rlat2) +

                                  Math.cos(rlat1) * Math.cos(rlat2) *

                                  Math.cos(rlong2-rlong1) );

       return ( 3963.0 * cangle );

   }

}

 

Compile Waypoint.java, and create a new waypoint.jar file containing both Degree and Waypoint:

 jar cf gps.jar com

 

Now, drop and recreate the jar file.

DROP JAR GPSFuncsJavaLib

 OPTIONS(0)

 CASCADE;

CREATE JAR GPSFuncsJavaLib

 LIBRARY 'file:/home/aspen/gps.jar'

 OPTIONS(0);

 

Finally, use CREATE FUNCTION to access the new Java extension:

CREATE FUNCTION find_distance

                 ( lat1 DOUBLE,  long1 DOUBLE, lat2 DOUBLE, long2 DOUBLE )

   RETURNS DOUBLE

   LANGUAGE JAVA

   NO SQL

   EXTERNAL NAME

     'GPSFuncsJavaLib:com.sqlstream.examples.geolocate.Waypoint.getDistanceMiles'

;

Note that dropping the jar file also required dropping all of the functions that depend on it, which we accomplished using the CASCADE keyword. The degree functions can be recreated using the same CREATE FUNCTION calls we used the first time around; that step is omitted here for brevity.

Calling Java UDFs from SQL UDFs:

Finally, it's worth noting that a UDF defined in SQL can call a UDF defined in Java:

CREATE FUNCTION to_miles( lat1 DOUBLE,  long1 DOUBLE, lat2 DOUBLE, long2 DOUBLE )

   RETURNS VARCHAR(30)

   CONTAINS SQL

   RETURN ( CAST( find_distance(lat1, long1, lat2, long2) AS VARCHAR(20) )

           || ' miles'  )

;