GeoIPFunctions UDX

<< Click to Display Table of Contents >>

Navigation:  Integrating Blaze with Other Systems > Transforming Data in s-Server >

GeoIPFunctions UDX

Previous pageReturn to chapter overviewNext page

The GeoIPFunctions UDX lets you determine the country, region, city, latitude, and longitude associated with IP addresses worldwide, along with the postal code and metro code for US IP addresses. It does so using the MaxMind GeoIp City database. s-Sever ships with a free version of this database. This topic describes how the plugin finds the MaxMind data file and how one should use the licensed version for production. You can use this free default database for testing, but for production use, you should use at least an up-to-date GeoLiteCity.dat, or probably a licensed data file.

See http://www.maxmind.com/en/city for more details on the MaxMind GeoIp City database.

Locating the Database File to Use for Lookups

1.A licensed city data file, at the default location /usr/local/share/GeoIP/GeoIPCity.dat If you install a licensed GeoIPCity.dat file in this location, the plugin will use that instead of the GeoLiteCity database.
2.A data file at the fully-qualified path specified by the database.path entry in the properties file geoip.properties in the package com.sqlstream.plugin.geoip. If you do not override this, the file points to a GeoLiteCity.dat file in the plugin jar file. You can override this with a custom properties file that points to any data file you prefer. The GeoLiteCity.dat file is provided in the plugin jar file.

Creating a function to call GeoIPFunctions

The following code creates and sets a test schema and loads the GeoIP Plugin Jar.

--- create a test schema

CREATE OR REPLACE SCHEMA "test";

SET SCHEMA '"test"';

 

-- load the GeoIp Plugin Jar

CREATE OR REPLACE JAR "GeoIp" LIBRARY 'file:plugin/geoip/geoip.jar'

OPTIONS(0);

 

Defining Functions to Accept the IP Address and Return Individual Columns

Once you have converted the IP address to a BIGINT, you can use functions to return individual columns from the GeoIP database. In the following functions, you can use the IP address directly, as in 18.9.22.169.

CREATE OR REPLACE FUNCTION "getCityFromAddress" (address VARCHAR(30))

RETURNS VARCHAR(32)

LANGUAGE JAVA

PARAMETER STYLE SYSTEM DEFINED JAVA

NO SQL

EXTERNAL NAME '"GeoIp":com.sqlstream.plugin.geoip.GeoIp.getCityFromAddress';

 

CREATE OR REPLACE FUNCTION "getCountryFromAddress" (address VARCHAR(30))

RETURNS VARCHAR(34)

LANGUAGE JAVA

PARAMETER STYLE SYSTEM DEFINED JAVA

NO SQL

EXTERNAL NAME '"GeoIp":com.sqlstream.plugin.geoip.GeoIp.getCountryFromAddress';

 

CREATE OR REPLACE FUNCTION "getPostalCodeFromAddress" (address VARCHAR(30))

RETURNS VARCHAR(12)

LANGUAGE JAVA

PARAMETER STYLE SYSTEM DEFINED JAVA

NO SQL

EXTERNAL NAME

'"GeoIp":com.sqlstream.plugin.geoip.GeoIp.getPostalCodeFromAddress';

 

Example invocations of the above functions

values("test"."getCityFromAddress"(18.9.22.169));

values("test"."getCountryFromAddress"(18.9.22.169));

values("test"."getPostalCodeFromAddress"(18.9.22.169));

 

Converting IP Address to BIGINT

Some of the functions require you to convert the IP address into a form usable by the plugin. The following routine takes the 4 numeric portions of an IP address in dotted notation and produces a single BIGINT value.

CREATE OR REPLACE FUNCTION "ip4num"(ip1 VARCHAR(3), ip2 VARCHAR(3), ip3 VARCHAR(3), ip4 VARCHAR(3))

   RETURNS BIGINT

   CONTAINS SQL

   DETERMINISTIC

   RETURN

       (CAST(ip1 AS BIGINT) * 256 * 256 * 256)

           + (CAST(ip2 AS INTEGER) * 256 * 256)

           + (CAST(ip3 AS INTEGER) * 256)

           + CAST(ip4 AS INTEGER);

 

CREATE OR REPLACE FUNCTION "getCityFromIP" (ipNum BIGINT)

   RETURNS VARCHAR(132)

   LANGUAGE JAVA

   PARAMETER STYLE SYSTEM DEFINED JAVA

   NO SQL

   EXTERNAL NAME '"GeoIp":com.sqlstream.plugin.geoip.GeoIp.getCityFromIP';

 

CREATE OR REPLACE FUNCTION "getCountryFromIP" (ipNum BIGINT)

   RETURNS VARCHAR(132)

   LANGUAGE JAVA

   PARAMETER STYLE SYSTEM DEFINED JAVA

   NO SQL

   EXTERNAL NAME '"GeoIp":com.sqlstream.plugin.geoip.GeoIp.getCountryFromIP';

 

CREATE OR REPLACE FUNCTION "getPostalCodeFromIP" (ipNum BIGINT)

   RETURNS VARCHAR(132)

   LANGUAGE JAVA

   PARAMETER STYLE SYSTEM DEFINED JAVA

   NO SQL

   EXTERNAL NAME '"GeoIp":com.sqlstream.plugin.geoip.GeoIp.getPostalCodeFromIP';

 

Example invocations of the above functions

values("test"."ip4num"('184','105','74','90'));

values("test"."getCityFromIP"(3093908058));

values("test"."getCountryFromIP"(3093908058));

values("test"."getPostalCodeFromIP"(3093908058));

 

Defining a Function for the GeoIP to Return All Stream Columns

The following code defines a function for the GeoIp UDX to take a stream and a named column as input and returns all stream columns plus 6 additional columns which correspond to the CountryCode, Country Name, City, Region, Latitude, and Longitude for the IP address.

CREATE OR REPLACE FUNCTION "addIpLocation" (c CURSOR, ipColName VARCHAR(25))

   RETURNS TABLE(c.*,

       "countryCode" CHAR(2),

       "countryName" VARCHAR(34),

       "city" VARCHAR(32),

       "region" CHAR(2),

       "latitude" float,

       "longitude" float)

   LANGUAGE JAVA

   PARAMETER STYLE SYSTEM DEFINED JAVA

   NO SQL

   EXTERNAL NAME '"GeoIp":com.sqlstream.plugin.geoip.GeoIp.addLocation';

 

Next, you define a simple stream used in the examples below

CREATE OR REPLACE STREAM "in"

   ("ipAsText" VARCHAR(32),

    "ipAsNum" BIGINT

   )

DESCRIPTION 'example input stream to GeoIP Plugin'

;

 

The following code invokes the GeoIp UDX and uses the ipAsNum column as input:

CREATE OR REPLACE VIEW "Example1"

DESCRIPTION 'example invocation of GeoIP Plugin' AS

   SELECT STREAM *

   FROM STREAM("test"."addIpLocation"(CURSOR(SELECT STREAM * FROM "test"."in"),'ipAsNum'))

;

 

The following code parses a dotted notation IP address into the component parts so it can be used in the second example invocation for the GeoIp UDX.

CREATE OR REPLACE VIEW "Parse"

DESCRIPTION 'example parsing of IP addresses' AS

   SELECT STREAM "ipAsText",

                 "test"."ip4num"(VCLP.R.ip1, VCLP.R.ip2, VCLP.R.ip3, VCLP.R.ip4) AS "ipParsed"

   FROM (

         SELECT STREAM *,

                       VARIABLE_COLUMN_LOG_PARSE("ipAsText", 'ip1, ip2, ip3, ip4' , '.') AS R

         FROM "in"

        ) AS VCLP

;

 

The following code invokes the GeoIp UDX and uses the ipAsText column which is parsed using Variable Column Log Parser and then passed to the ip4num function to produce the BIGINT required by the GeoIp UDX.

CREATE OR REPLACE VIEW "Example2"

DESCRIPTION 'example invocation of GeoIP Plugin' AS

   SELECT STREAM *

   FROM STREAM("test"."addIpLocation"(CURSOR(SELECT STREAM * FROM "Parse"),'ipParsed'))

;

Next, write a stream to invoke the code above

!set incremental true

!set maxwidth 512

 

select stream * from "Example2";

 

Next, in another sqlline session, enter the following:

-- insert into "test"."in" values ('184.105.74.90',3093908058);

-- insert into "test"."in" values ('124.105.74.90',3093908058);

 

In the original terminal, the following result will return:

+----------------+-------------+-------------+---------------+---------+--------+--------------------+---------------------+

|  ipAsText      |  ipParsed   | countryCode | countryName   | city    | region | latitude           | longitude           |

+----------------+-------------+-------------+---------------+---------+--------+--------------------+---------------------+

| 184.105.74.90  | 3093908058  | US          | United States | Fremont | CA     | 37.5155029296875   | -121.89619445800781 |

| 124.105.74.90  | 2087275098  | PH          | Philippines   | Cavite  | B6     | 14.477798461914062 | 120.8931884765625   |