5Analytics Enterprise AI Platform Documentation

Quick Start

Here we will give you a quick introduction to working with the 5Analytics AI Platform. We will start with uploading a small script and querying it as a web service. After that, we will show how to write applications in R. For a more complete reference, see the programming guide.

You can download the latest version of the 5Analytics Enterprise AI Platform at http://download.5analytics.com/.

If you require a password or further assistance, please contact support. Instead of installing the software manually, you can also pull the latest Docker image from Docker Hub.

> docker pull 5analytics/ada

Start, Stop and Monitor the Server

To start and stop the server, use the ada command:

> /opt/ada/ada start

Start the Docker Image

> docker run -p 5050:5050 5analytics/ada

Now the software is up and running. You can login at http://localhost:5050/

For some versions of Docker on Windows the container hostname is not localhost but 192.168.99.100, so the URL is http://192.168.99.100:5050/.

Default Username and Password are:

(screenshot)

Log files can be viewed in the Platform (Log File) or, if you are working locally, are stored in the root directory of the 5Analytics environment:

> head /opt/ada/logfile.log
> head /opt/ada/nohup.out

AI Applications

AI Applications are scripts that can be written in different scripting languages. Currently, JavaScript, R, Python and Scala are supported. These scripts can be accessed via Web-Services and perform computations on data and automate decision making.

The 5Analytics Enterprise AI Platform turns function definitions into Web-Services.

Functions have to be named fafun_* in order to be exposed as web services.

 # create simple R script
> echo "fafun_curl <- function() { 2 }" > test.R

You can upload a file with the Web-Interface (Scripts) or with webdav.

 # get directory listing
> curl -u usr:pswd --digest 'http://localhost:5050/up/dav/'
 # upload file to server via webdav
> curl -u usr:pswd --digest -T test.R 'http://localhost:5050/up/dav/'

Now we can execute this code as a web service.

> curl "http://localhost:5050/if/json/R/v1/fafun_curl?_token=test_token"
{
  "null": [1.0]
}
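The same call can be scripted. The following Python sketch builds a service URL following the /if/json/&lt;engine&gt;/v1/&lt;function&gt; layout seen in the example and extracts the value from a response of the shape shown; the helper name is hypothetical.

```python
import json
from urllib.parse import urlencode

def service_url(host, engine, function, token, **params):
    # URL layout taken from the examples in this guide:
    # http://<host>/if/json/<engine>/v1/<function>?...&_token=<token>
    query = dict(params)
    query["_token"] = token
    return "http://%s/if/json/%s/v1/%s?%s" % (host, engine, function, urlencode(query))

url = service_url("localhost:5050", "R", "fafun_curl", "test_token")
print(url)  # http://localhost:5050/if/json/R/v1/fafun_curl?_token=test_token

# Responses wrap the result in a JSON object, as shown above:
response = '{"null": [2.0]}'
value = json.loads(response)["null"][0]
print(value)  # 2.0
```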

More can be found in the Operation section.

Introduction

This introductory guide provides a general overview of the 5Analytics Enterprise AI Platform. This guide also includes frequently asked questions about 5Analytics products and describes how to get support, report issues, and receive information about updates and new releases.

About the 5Analytics Enterprise AI Platform

The 5Analytics Enterprise AI Platform is a server that allows users to easily write Artificial Intelligence applications in R or JavaScript.

The server is designed as a micro service, providing different Web APIs to access the service.

Release Notes

This guide contains release and download information for installers and administrators. It includes release notes as well as information about versions and downloads. The guide also provides a release matrix that shows which major and minor release version of a product is supported with which release version of the 5Analytics Enterprise AI Platform.

Docker

For ease of use, we provide ready-to-use Docker images of the 5Analytics Enterprise AI Platform.

The latest version of the 5Analytics Enterprise AI Platform can be pulled from Docker Hub.

> docker pull 5analytics/ada

The platform can be started with the command:

> docker run -p 5050:5050 5analytics/ada:latest

or with an interactive shell:

> docker run -p 5050:5050 -i -t 5analytics/ada:latest /bin/bash

Once the platform is started, it can be reached with a browser at http://localhost:5050/

The Docker image comes with a number of environment variables that can be set to alter the way the platform starts:


------------------------------------------------------------------------
Starting 5Analytics docker image 1.5-0
 
There are different ways to start the 5Analytics docker image.
 
 
 
Test and demo environment
                If you start the image without any parameters, it will run as a
                regular process.
 
        Example:
                docker run -p 5050:5050 5analytics/ada:latest
 
                If you start the image with '/bin/bash' as the only parameter, it will
                open a bash shell and allow you to interact with the image.
                In this case, logging is written to /opt/ada/nohup.out.
 
        Example:
                docker run -p 5050:5050 -i -t 5analytics/ada:latest /bin/bash
 
 
 
Production environment
                In case of a production environment it might be helpful to store
                scripts and configuration files outside of the image. In such a case
                you should remap the directories '/opt/ada/etc' and '/opt/ada/var/code'.
 
        Example:
                docker run -p 5050:5050 -v /path/to/etc:/opt/ada/etc \ 
                        -v /path/to/scripts:/opt/ada/var/code 5analytics/ada:latest
 
 
The 5Analytics Docker Image supports the following environment variables

 ADA_USER       Set the user for the image
 ADA_PSWD       Set the password for the user ADA_USER
 
 ADA_NCH        Set the Connect Node hostname (to start the image as worker)
 ADA_NCP        Set the Connect Node port
 
 ADA_HOST       Set the hostname
 ADA_PORT       Set the (internal) port
 ADA_APP_PORT   Set the application port
 ADA_HSH_PORT   Set the hash map port
 
More details can be found online: http://doc.5analytics.com/
------------------------------------------------------------------------

Install and Upgrade

This section describes the requirements for installing the 5Analytics Enterprise AI Platform.

The latest version of the 5Analytics Enterprise AI Platform can be downloaded from http://download.5analytics.com/.

Requirements

For running the 5Analytics Enterprise AI Platform, the minimal requirement is a server with at least:

Supported Operating Systems

Currently the 5Analytics Enterprise Platform runs on Linux; supported platforms are SUSE SLES 12, openSUSE Leap 42.1, Red Hat Enterprise Linux 6 and 7, and CentOS 7.

Supported JDK Versions

Currently the only supported JRE/JDK is Oracle Java SE 8.

Supported R Versions and Packages

Currently R version 3.3.2 is supported.

Supported Python Versions and Packages

Currently Python version 2.7.13 and Python version 3.6.2 are supported.

Supported Scala Versions and Packages

Currently Scala version 2.12.4 is supported. There is no support for packages yet.

Supported Databases

The 5Analytics Enterprise AI Platform officially supports only PostgreSQL, but any JDBC-compliant database system can be used with it.

Installation Overview

Please follow the appropriate link for your platform:

Administration

On Unix, the 5Analytics Enterprise AI Platform is run as a daemon named ada that executes continuously in the background to handle requests. This document describes how to invoke the daemon ada.

/opt/ada/ada

5Analytics AI Platform 1.5-0_beta

Usage: ada [--version] {start|stop|restart|reload|status|start-worker} [options]

Options can be
        -f | --file <config file>
                specifies the location of the config file
        -nd | --no-daemon
                prints all output to the console and not to
                a log file
        -h | --help
                prints this help/usage message
        -i | --intel
                loading Intel libraries

The following options only work in combination with the command <start-worker>
        -nch| --connect-node-host <connect-node-host>
                the hostname of the node to connect to.
        -ncp| --connect-node-port <connect-node-port>
                the port of the node to connect to.
        -ho | --host <host>
                the host of the script worker to be started.
        -p  | --port <port>
                the port of the script worker to be started.
        -ap | --application-port <application-port>
                the application port of the script worker to be started.
        -hp | --hash-map-port <hash-map-port>
                the hash map port of the script worker to be started.
        -he | --heap <heap>
                the heap of the script worker to be started.
        -wr | --write
                Writes the configuration of the new node to a file.
 
A detailed documentation can be found online: 
http://doc.5analytics.com/
 

The 5Analytics Enterprise AI Platform runs as the unprivileged user ada. You can, however, set the user with the environment variable SCRIPT_USER. SCRIPT_USER=tom /opt/ada/ada start would start the server as the user tom.

The configuration of the 5Analytics Enterprise AI Platform is defined in the configuration file /opt/ada/etc/ada.xml

Scripting environments

<engines>
	<engine factory="de.visionstec.R.RScriptEngineFactory" key="R">
		<extensions>R</extensions>
		<parameter id="FA_SERIALIZE">false</parameter>
		<parameter id="LIBRARY_PATH">var/lib/R</parameter>
		<parameter id="WORKING_DIRECTORY">var/code</parameter>
	</engine>
	...
</engines>

In the example above, the R scripting environment is loaded. Currently, the following scripting environments are available:

The parameters that you can set for each engine alter how the engine is loaded. Some are specific to that engine; please refer to the documentation of each engine. Here is a list of generic parameters:

Cluster computing

The 5Analytics Enterprise AI Platform can be run in a cluster environment. The processes that belong to a cluster are defined in the configuration file. There are four different types of processes: web, network, storage and script.

The web process is in charge of offering the different web interfaces as well as the webdav upload interface.

The network process represents the connection layer that gets requests from web processes as well as from API libraries.

The storage process manages data access.

The script process is in charge of executing the requests.

Each process has an internal port (port) and an external port (appPort). The internal port is used for internal communication such as management or process discovery, whereas the external port handles requests from other processes or applications.

For each process the amount of memory used (heap) can be defined. It defaults to 16m.

<nodes>
		<node type="web" create="true">
			<hostname>localhost</hostname>
			<port>7807</port>
			<appPort>5050</appPort>
		</node>

		<node type="network" create="true">
			<hostname>localhost</hostname>
			<port>7107</port>
			<appPort>7007</appPort>
			<heap>512m</heap>
		</node>


		<node type="storage" create="true">
			<hostname>localhost</hostname>
			<port>7606</port>
			<appPort>7506</appPort>
		</node>


		<node type="script" create="true">
			<hostname>localhost</hostname>
			<port>7106</port>
			<appPort>7006</appPort>
		</node>
	</nodes>
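For scripting around such a configuration (e.g. health checks), the node definitions can be read with standard XML tooling. A sketch using Python's standard library; the helper is hypothetical, and the 16m heap default is taken from the text above.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the <nodes> layout shown above.
NODES_XML = """
<nodes>
    <node type="web" create="true">
        <hostname>localhost</hostname>
        <port>7807</port>
        <appPort>5050</appPort>
    </node>
    <node type="network" create="true">
        <hostname>localhost</hostname>
        <port>7107</port>
        <appPort>7007</appPort>
        <heap>512m</heap>
    </node>
</nodes>
"""

def read_nodes(xml_text):
    """Collect type, host, ports and heap for every configured process."""
    nodes = []
    for node in ET.fromstring(xml_text).findall("node"):
        nodes.append({
            "type": node.get("type"),
            "host": node.findtext("hostname"),
            "port": int(node.findtext("port")),
            "appPort": int(node.findtext("appPort")),
            "heap": node.findtext("heap", default="16m"),  # 16m default per the docs
        })
    return nodes

for n in read_nodes(NODES_XML):
    print(n["type"], n["host"], n["port"], n["appPort"], n["heap"])
```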

Authentication


	<authentication>

		<!-- the authentication for the upload service -->
		<upload>
			<auth class="de.visionstec.mbase.cluster.security.Authentication">
				<users>
					<user id="test_1" password="xyz_1" />
					<user id="test_2" password="xyz_2" />
					<user id="test_3" password="xyz_3" />
				</users>
			</auth>
			<method class="de.visionstec.mbase.cluster.web.DigestAuthentication"
				realm="test@test"
				method="auth">
			</method>
		</upload>

		<network>
			<auth class="de.visionstec.mbase.cluster.security.DummyAuthentication" />
		</network>

		<!-- the authentication for the web interfaces -->
		<interface>
			<auth class="de.visionstec.mbase.cluster.security.DummyAuthentication" />
			<method class="de.visionstec.mbase.cluster.web.SimpleTokenAuthentication">
				<token>test_token</token>
			</method>
		</interface>
	</authentication>
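On the client side, the SimpleTokenAuthentication method above amounts to sending a matching _token query parameter with every interface request. A minimal sketch of that check (illustrative only; not the server's implementation):

```python
from urllib.parse import urlparse, parse_qs

CONFIGURED_TOKEN = "test_token"  # from <token>test_token</token> above

def is_authenticated(url):
    """Accept a request only if its _token query parameter matches."""
    params = parse_qs(urlparse(url).query)
    return params.get("_token") == [CONFIGURED_TOKEN]

print(is_authenticated("http://localhost:5050/if/json/R/v1/fafun_curl?_token=test_token"))  # True
print(is_authenticated("http://localhost:5050/if/json/R/v1/fafun_curl"))                    # False
```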

Data access

The 5Analytics Enterprise AI Platform lets you define data sources that can be used in your scripting environment. This way, you can access data sources as if the data were in the memory of your local environment.


		<datasource
			key="db"
			factory="de.visionstec.mbase.storage.virtual.JdbcTableFactory"
			connection="jdbc:postgresql:database"
			usr="***usr***"
			pswd="***pswd***"
			driver="org.postgresql.Driver">
			<table key="tab_1" ds="db" cache="false">
				<specification>select * from "table"</specification>
			</table>
			<table key="tab_1_lim" ds="db" cache="false">
				<parameter index="1" value="10" type="INT" />
				<specification>select * from "table" limit ?</specification>
			</table>
		</datasource>


Besides accessing existing tables or SQL queries, you can also use the 5Analytics Enterprise AI Platform to join data from different data sources. The following definition creates a view that joins data from the tables test and testa, both defined as data sources as described above.


		<datasource
			key="view1"
			factory="view">
			<table key="test_view" ds="view1" cache="false">
				<specification>SELECT * FROM "test" t JOIN testa a ON t."key" = a."key"</specification>
			</table>
		</datasource>

Logging

5Analytics comes with a pre-built logging system. It can be configured based on the structure of the software.

There are currently two types of logging supported: file and syslog.

File logging

The file logger is set by default. It simply requires the path (absolute, or relative to the platform's log directory var/log/) of the log file. If the file does not yet exist, it is created.
       <logging level="INFO" type="file" file="var/log/default.log">
                <name level="INFO">de.visionstec.mbase.cluster.threads</name>
        </logging>

Syslog logging

To enable syslog logging, you have to set the attribute type to syslog and file to the host/port of the syslog daemon. If you omit the port, the port is set to 514 by default.

       <logging level="INFO" type="syslog" file="tcp://host:1234">
                <name level="INFO">de.visionstec.mbase.cluster.threads</name>
        </logging>

Operation

File Upload

The first step when working with the 5Analytics Enterprise AI Platform is uploading a script to the platform. Files are uploaded into the system with WebDAV.

> curl "webdav://localhost:5050/up/dav/" 
 > cat test.R
fafun_ada1 <- function() { return (1); } ;
fafun_ada2 <- function(x) { return (x); }
fafun_ada3 <- function(x=1) { return (x); }

> curl -u test_1:xyz_1 --digest -T test.R 'http://localhost:5050/up/dav/test.R'

File upload on Windows

On Windows you can map a drive to the 5Analytics Enterprise AI Platform WebDAV folder. The following command maps the 5Analytics Enterprise AI Platform WebDAV drive on host hostname to a local drive. The drive is mapped with the credentials test_1 as user and xyz_1 as password.

> net use * http://hostname:5050/up/dav/ xyz_1 /user:test_1

On some systems it might be necessary to activate the WebClient service.

> sc config "WebClient" start=auto
> net start WebClient

Interfaces

The 5Analytics Enterprise AI Platform provides a number of interfaces to be accessed by 3rd party applications.

JSON, REST and SOAP

With the JSON API, response objects are encoded as JSON objects, which can be interpreted by any JSON parser.

Here is an example for R

> curl "http://localhost:5050/if/json/R/v1/fafun_ada2?x=1&_token=test_token"
{
  "null": [1.0]
}

and one for Python

> curl "http://localhost:5050/if/json/python/v1/fafun_ada2?x=1&_token=test_token"
{
  "null": 1
}
The service can also be called from JavaScript, for example with XMLHttpRequest:

function loadJSON(string, callback) {
	var xhttp = new XMLHttpRequest();
	xhttp.onreadystatechange = function() {
		if (xhttp.readyState == 4 && xhttp.status == 200) {
			callback(xhttp.responseText);
		}
	};
	xhttp.open("GET", string, true);
	xhttp.send();

	console.log("Calling ".concat(string))
}

loadJSON(
	"http://localhost:5050/if/json/R/v1/fafun_ada2?x=1&_token=test_token",
	function(response) {
		var _obj = JSON.parse(response);
		var _factor = (_obj['null'])[0];

		console.log(_factor);

		// do something with the response
	});

Writing AI Code

Function parameters that start with an underscore are reserved for FiveA internal use.

R-Package

The 5Analytics R package provides a range of functionality for AI and data access.

Data access

5Analytics comes with a data virtualization layer. This layer allows users to access data stored in external data sources as if they were local R objects.

For example, suppose we have a table Movies stored in a SQL database. This table can be referenced in the 5Analytics Enterprise AI Platform with the name movies, as described in the following.

> library(FiveA)
Loading required package: rJava

Attaching package: ‘FiveA’

The following object is masked from ‘package:base’:

    table

> mov <- fiveA("localhost",7506,table="movies")
# let's have a look at the object
> mov
5Analytics Object movies on localhost:7506
# get the column names
> names(mov)
[1] "movie_id" "year"     "title"
# access data stored in table movies
# this statement sends the following SQL request to the database
# SELECT * FROM Movies
> as.data.frame(mov)
      movie_id year
1            1 2003
2            2 2004
3            3 1997
4            4 1994
5            5 2004
...
...
# we can also filter the data. For example if we would like to filter
# according to the column 'year' we would write the following statement
# which would send the following SQL request to the database
# SELECT * FROM Movies where year > 2004
> mov[mov$year > 2004,]
    movie_id year
1         17 2005
2         85 2005
3         91 2005
4        149 2005
5        151 2005
...
...
# this can become arbitrarily complex; the statement below will yield
# the SQL statement SELECT * FROM Movies where year > 2004 AND movie_id IS NOT NULL
> mov[mov$year > 2004 & !is.null(mov$movie_id),]
    movie_id year
1         17 2005
2         85 2005
3         91 2005
4        149 2005
5        151 2005
...
...
# or you could do wildcard matching which will yield
# the SQL statement SELECT * FROM Movies where year <= 1920 AND title LIKE '%Vol%'
> mov[mov$year <= 1920 & like(mov$title,"%Vol%"),]
  movie_id year                              title
1     9001 1915 Chaplin"s Essanay Comedies: Vol. 3
2     9810 1917            Chaplin Mutuals: Vol. 1
3    12657 1919    Chaplin: The Collection: Vol. 5
...
...
# of course you can also use commonly used functions such as summary or table
# on a 5Analytics object
> table(mov[,"year"])

   0 1896 1909 1914 1915 1916 1917 1918 1919 1920 1921 1922 1923 1924 1925 1926
   7    1    1    2    5    4    3    2    8    6    9    6    2    4   12    9
1927 1928 1929 1930 1931 1932 1933 1934 1935 1936 1937 1938 1939 1940 1941 1942
  13   10   10   13   13   22   15   26   16   29   27   21   37   34   33   33
1943 1944 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 1955 1956 1957 1958
...
...
>

These examples work on any data source. There are, however, some caveats. The like feature on ElasticSearch, for example, operates on terms. If you use it to query an analyzed field, it will examine each term in the field, not the field as a whole. Terms are stored in lower case, which means the match is case insensitive.
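Conceptually, the virtualization layer rewrites such filter expressions into SQL predicates before sending them to the database. The following Python sketch illustrates the translation idea only; the classes are hypothetical, and the real FiveA layer is more involved.

```python
class Predicate:
    """A SQL predicate built up by combining column comparisons."""
    def __init__(self, sql):
        self.sql = sql
    def __and__(self, other):
        return Predicate("(%s AND %s)" % (self.sql, other.sql))
    def __or__(self, other):
        return Predicate("(%s OR %s)" % (self.sql, other.sql))

class Column:
    """A table column whose comparisons yield SQL predicates."""
    def __init__(self, name):
        self.name = name
    def __gt__(self, value):
        return Predicate("%s > %s" % (self.name, value))
    def __le__(self, value):
        return Predicate("%s <= %s" % (self.name, value))
    def like(self, pattern):
        return Predicate("%s LIKE '%s'" % (self.name, pattern))

# The same filter as mov[mov$year <= 1920 & like(mov$title,"%Vol%"),]:
year, title = Column("year"), Column("title")
pred = (year <= 1920) & title.like("%Vol%")
print("SELECT * FROM Movies WHERE " + pred.sql)
# SELECT * FROM Movies WHERE (year <= 1920 AND title LIKE '%Vol%')
```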

In some situations it might become necessary to parameterize data access, for example for filters on GROUP BY statements in SQL. Here we can use parameters to alter the SQL statement. Let's look at the following statement, which we can reference with movies_group:

 select year, COUNT(*) from movies WHERE year > ? AND year < ? GROUP BY year 

Here we would have two parameters that need to be set. Initially they are set with default values but we can alter those values inside an R script with the parameter function.

> mov_grp <- fiveA("localhost",7506,table="movies_group")
> mov_grp
5Analytics Object movies_group on localhost:7506
> parameter(mov_grp)
NULL
> parameter(mov_grp) <- c(1999,2001)
> parameter(mov_grp)
[1] 1999 2001
> parameter(mov_grp,param=c(1999,2001))
> names(mov_grp)
[1] "year"  "count"
> as.data.frame(mov_grp)
...
... DATA
...
>
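The parameter mechanism corresponds to ordinary placeholder binding, as known from JDBC. The same idea can be reproduced self-contained with Python's standard sqlite3 module (sample data assumed; the SQL is the movies_group statement from above):

```python
import sqlite3

# Build a small sample movies table in memory (assumed data, for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (movie_id INTEGER, year INTEGER)")
conn.executemany("INSERT INTO movies VALUES (?, ?)",
                 [(1, 1999), (2, 2000), (3, 2000), (4, 2001), (5, 2005)])

# The movies_group statement, with two positional parameters:
sql = "SELECT year, COUNT(*) FROM movies WHERE year > ? AND year < ? GROUP BY year"

# Binding (1999, 2001), as parameter(mov_grp) <- c(1999,2001) does in R:
rows = conn.execute(sql, (1999, 2001)).fetchall()
print(rows)  # [(2000, 2)]
```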

Python-Package

Data access

5Analytics comes with a data virtualization layer. This layer allows users to access data stored in external data sources as if they were local Python objects.

For example, suppose we have a table Numbers stored in a SQL database. This table can be referenced in the 5Analytics Enterprise AI Platform with a FiveA object numbers as described in the following.

> from fivea import FiveA
> numbers = FiveA('localhost', 7506, 'Numbers')
INFO fivea 17-10-17 17:16:48 Create FiveA table 'Numbers' at localhost:7506

> numbers.columns() # get the column names
[u'id', u'name']

> len(numbers) # get the number of rows in the table
10

> numbers[:] # get all data (corresponding to SELECT * FROM Numbers)
[[0, u'null'], [1, u'eins'], [2, u'zwei'], [3, u'drei'], [4, u'vier'],
 [5, u'fünf'], [6, u'sechs'], [7, u'sieben'], [8, None], [9, u'neun']]

> numbers[4] # select a specific row by row number
[4, u'vier']

> numbers[:, 0] # select a specific column by id or name
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

> numbers[numbers.id > 5] # filter the data by column id (corresponding to SELECT * FROM Numbers where id > 5)
[[6, u'sechs'], [7, u'sieben'], [8, None], [9, u'neun']]

> numbers[(numbers.id >= 4) & (numbers.id < 5) | like(numbers.name, '%ei%') ] # filters can become arbitrarily complex
[[1, u'eins'], [2, u'zwei'], [3, u'drei'], [4, u'vier']]

> sum(numbers[:, 0]) # you can also use aggregates or iterate over all entries
45