StreamLab Projects Overview
A StreamLab project is a set of StreamLab sources, external connections, sinks, and pipeline guides. You can create projects for different use cases, such as monitoring login activity on a website, tracking vehicle speeds across a bus system, or measuring HTTP requests on a server. StreamLab contains several built-in StreamApps that you can use as the basis for projects. See StreamLab Galleries.
StreamLab projects have settings that let you change the project's name and schema, and manage how the project handles streaming data. See Using Project Settings below.
Sources are log files, Kafka topics, JSON files, web feeds, external database tables, s-Server streams and tables, and so on. They capture data from web feeds, sensors, message buses, network feeds, applications, databases, and other sources. You can parse local log files--files reachable from s-Server--through StreamLab.
External Connections are databases or other data sources external to SQLstream s-Server. Once you set up an external connection, you can read and write to such data sources from StreamLab using a sink.
Sinks are destinations for rows of data, usually an external file system, message bus, or database. In s-Server, a sink consists of a stream and a pump to fill it (a pump moves data from one location to another). Internally, StreamLab uses sinks to connect pipeline guides with each other.
Pipeline Guides are collections of commands, suggestions, and scripts that you use to generate SQL views on your data sources. You can view and export the SQL generated by pipeline guides. See the topic StreamLab Pipeline Guides Overview in this guide for more details.
Projects can be named, saved, and reopened. They have unique URLs, which you can share with others. Project names and user names are appended to the StreamLab URL, as in the following:
Generally, you will want to have multiple StreamLab projects to manage different aspects of your data. You may want to start with a single project and save a copy of it when you are satisfied with it, building up a set of StreamLab applications to examine different configurations of data.
StreamLab projects are listed on the StreamLab projects home page.
You can change the project's name and schema name, and adjust other settings, through the Project Settings dialog. You can open this dialog box by clicking the Settings icon in the top left corner of the StreamLab page:
The Project Settings dialog lets you change the project's name and schema name, and manage how StreamLab handles queries on streams that are in use, how it throttles sources, and so on.
A schema is where project elements are "stored" in s-Server. By default, all the objects you create--pipeline guides, sinks, external connections, sources--are stored in the Project schema. The schema name is particularly important for developers who are accessing content that you create in StreamLab through s-Server.
Managing How StreamLab Handles Currently-Queried Streams
When a stream is being queried, it's not possible to change the stream with a SQL script (that is, StreamLab cannot submit a CREATE OR REPLACE STREAM script). By default, StreamLab asks you if it's okay to terminate these queries, but you can also choose to terminate these queries without asking or never terminate these queries.
Stop Queries Without Asking. StreamLab automatically terminates queries when you submit SQL for a currently-queried stream.
Ignore, Allowing SQL Scripts to Fail. StreamLab leaves the queries running and allows the submitted SQL to fail.
Ask for Permission to Stop Queries. This is the default behavior. With this setting enabled, when you submit SQL for a currently-queried stream, StreamLab lists the currently-queried streams and asks your permission to terminate the query or queries:
When StreamLab terminates these queries, users viewing a dashboard that uses one of them will see incoming data stop flowing. These users should save their changes to the dashboard and refresh the page.
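The three options above amount to a small decision rule. The following Python sketch illustrates that rule only; it is not StreamLab code, and the names QueryPolicy, should_terminate, and confirm are hypothetical.

```python
from enum import Enum
from typing import Callable, List


class QueryPolicy(Enum):
    """Hypothetical names for the three settings described above."""
    STOP_WITHOUT_ASKING = "stop"
    IGNORE_ALLOW_FAIL = "ignore"
    ASK_PERMISSION = "ask"  # the default, per the text above


def should_terminate(
    policy: QueryPolicy,
    active_queries: List[str],
    confirm: Callable[[List[str]], bool],
) -> bool:
    """Return True if running queries should be stopped before submitting SQL."""
    if not active_queries:
        return False  # the stream is not being queried, so nothing blocks the script
    if policy is QueryPolicy.STOP_WITHOUT_ASKING:
        return True
    if policy is QueryPolicy.IGNORE_ALLOW_FAIL:
        return False  # queries keep running; the submitted SQL is allowed to fail
    # ASK_PERMISSION: show the list of queries to the user and use their answer.
    return confirm(active_queries)
```

Here confirm stands in for the permission prompt: it receives the list of currently running queries and returns the user's answer.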
Manage the SQL Run by StreamLab
By default, StreamLab runs the entire SQL script when you open it. You can deselect the Run the Complete SQL Script When Opened option to avoid running the entire script.
In order to use sources, StreamLab renders them in SQL. If you are not using sources, you can choose to leave them unrendered, which may improve performance in some cases. The Unattached Sources option lets you choose to leave these sources unrendered.
Sometimes, you may want to slow a data feed for testing purposes. In these cases, you can throttle your source--slow it to a specified number of rows per second. The default throttled rate is one row per second, but you can adjust this default rate by entering a different number in Project Standard Throttle Rate. You can also disable throttling for the project. You would most likely want to do so once you are ready to deploy a stream app. See throttling sources for more details.
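To make the idea of a throttle rate concrete, here is a minimal Python sketch that caps a row feed at a given number of rows per second. It only illustrates the concept and is not how StreamLab or s-Server implements throttling. The function name throttle and its parameters are assumptions made for this example.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def throttle(
    rows: Iterable[T],
    rows_per_second: Optional[float] = 1.0,  # mirrors the default of one row per second
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Yield rows with a fixed pause between them; None disables throttling."""
    if rows_per_second is None:
        yield from rows  # throttling disabled: pass rows through unchanged
        return
    if rows_per_second <= 0:
        raise ValueError("rows_per_second must be positive")
    delay = 1.0 / rows_per_second
    for index, row in enumerate(rows):
        if index:
            sleep(delay)  # pause before every row except the first
        yield row
```

For example, throttle(feed, rows_per_second=5) would emit at most about five rows per second, and passing rows_per_second=None corresponds to disabling throttling for the project.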
Managing How the Scrutinizer Identifies Partition Keys
The Project Settings dialog box also lets you adjust settings related to the Scrutinizer. See the topic Managing the Scrutinizer for more details.
You can adjust what columns are identified as partition keys by changing Partition Key Unique Limit and Partition Key Length Limit.
Partition Key Unique Limit determines how limited a column needs to be--how many different values it can contain--in order to be identified as a partition key. If the Scrutinizer sees more distinct values than Partition Key Unique Limit, it assumes the column probably isn't a partition key.
Partition Key Length Limit determines how long a value in a partition key can be. If the Scrutinizer sees a string value longer than Partition Key Length Limit, it assumes the column probably isn't a partition key. You can adjust these two values to shape which columns end up being marked.
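The two limits can be read as a simple filter over a column's sampled values. The Python sketch below shows one way such a check could work, assuming the limits behave as described above. It is not the Scrutinizer's actual algorithm, and the function name and parameters are hypothetical.

```python
from typing import Hashable, Iterable


def looks_like_partition_key(
    values: Iterable[Hashable],
    unique_limit: int,   # stands in for Partition Key Unique Limit
    length_limit: int,   # stands in for Partition Key Length Limit
) -> bool:
    """Return False as soon as a column exceeds either limit."""
    distinct = set()
    for value in values:
        # A string value longer than the length limit rules the column out.
        if isinstance(value, str) and len(value) > length_limit:
            return False
        distinct.add(value)
        # More distinct values than the unique limit also rules it out.
        if len(distinct) > unique_limit:
            return False
    return True
```

Under this reading, raising either limit marks more columns as partition keys, and lowering either limit marks fewer.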