![Page 1: Creating streams with DataSift](https://reader036.vdocument.in/reader036/viewer/2022082512/5552811ab4c905b4598b4eee/html5/thumbnails/1.jpg)
Creating Streams with DataSift
![Page 2: Creating streams with DataSift](https://reader036.vdocument.in/reader036/viewer/2022082512/5552811ab4c905b4598b4eee/html5/thumbnails/2.jpg)
Creating a Stream: Workflow
Stream Specification
Stream Definition
Filtered Data
![Page 3: Creating streams with DataSift](https://reader036.vdocument.in/reader036/viewer/2022082512/5552811ab4c905b4598b4eee/html5/thumbnails/3.jpg)
Creating a Stream: Specification
What do you want the elements to contain?
What sources do you want the data to come from?
What is your budget for data acquisition?
Who is this data for?
Work out what you want your stream to do
![Page 4: Creating streams with DataSift](https://reader036.vdocument.in/reader036/viewer/2022082512/5552811ab4c905b4598b4eee/html5/thumbnails/4.jpg)
Creating a Stream: Definition
Create Stream in DataSift
Create FSDL Definition
Verify with live data
Write a Stream Definition that executes your specification
![Page 5: Creating streams with DataSift](https://reader036.vdocument.in/reader036/viewer/2022082512/5552811ab4c905b4598b4eee/html5/thumbnails/5.jpg)
Creating a Stream: Filtered Data
Retrieve the data that is filtered by your stream
JSON API
HTTP Streaming
WebSockets Streaming
RSS
![Page 6: Creating streams with DataSift](https://reader036.vdocument.in/reader036/viewer/2022082512/5552811ab4c905b4598b4eee/html5/thumbnails/6.jpg)
Creating a Stream in DataSift
1. Select the Create Stream button on any page on DataSift
![Page 7: Creating streams with DataSift](https://reader036.vdocument.in/reader036/viewer/2022082512/5552811ab4c905b4598b4eee/html5/thumbnails/7.jpg)
Creating a Stream in DataSift
2. Fill in the title, description, and tags for your Stream
The Title and Description will be shown next to your Stream
The Tags will be used for search and categorisation of your Stream
Enabling the Private checkbox will make your Stream visible only to you
![Page 8: Creating streams with DataSift](https://reader036.vdocument.in/reader036/viewer/2022082512/5552811ab4c905b4598b4eee/html5/thumbnails/8.jpg)
Creating a Stream in DataSift
3. Create your first stream definition
This is the Stream Editor
There is a default stream definition already inserted for you
Why not try changing "hello world" to a different value?
e.g. interaction.content contains "cat"
![Page 9: Creating streams with DataSift](https://reader036.vdocument.in/reader036/viewer/2022082512/5552811ab4c905b4598b4eee/html5/thumbnails/9.jpg)
Creating a Stream in DataSift
4. Hit the Save button
Your Stream is now saved
You can use the breadcrumbs to go back to see a live preview of the results
![Page 10: Creating streams with DataSift](https://reader036.vdocument.in/reader036/viewer/2022082512/5552811ab4c905b4598b4eee/html5/thumbnails/10.jpg)
FSDL: Filtered Stream Definition Language
FSDL is the language used to write Stream Definitions for DataSift
The language takes the following basic format:
<term> <logical operator> <term> <logical operator>
There must be a minimum of 1 term in a definition.
All terms must be separated by logical operators.
A logical operator is either “and” or “or”.
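The rules above can be sketched as a small Python helper that assembles a definition string from terms and logical operators. This is only an illustration: `build_definition` and `LOGICAL_OPERATORS` are hypothetical names, not part of DataSift, and the helper only checks the three rules stated on this slide.

```python
# Hypothetical helper, not part of DataSift: assembles an FSDL definition
# string following the rules stated on the slide.
LOGICAL_OPERATORS = {"and", "or"}


def build_definition(terms, operators):
    """Interleave terms with logical operators: t0 op0 t1 op1 t2 ..."""
    if len(terms) < 1:
        raise ValueError("a definition needs at least one term")
    if len(operators) != len(terms) - 1:
        raise ValueError("adjacent terms must be separated by exactly one operator")
    for op in operators:
        if op.lower() not in LOGICAL_OPERATORS:
            raise ValueError(f"unknown logical operator: {op!r}")

    parts = [terms[0]]
    for op, term in zip(operators, terms[1:]):
        parts.extend([op, term])
    return " ".join(parts)


single = build_definition(['interaction.content contains "cat"'], [])
combined = build_definition(
    ['interaction.content contains "cat"', 'language.tag == "en"'],
    ["and"],
)
```

A single term needs no operator at all, which is why an empty operator list is accepted when there is exactly one term.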
![Page 11: Creating streams with DataSift](https://reader036.vdocument.in/reader036/viewer/2022082512/5552811ab4c905b4598b4eee/html5/thumbnails/11.jpg)
FSDL: Nested Rule
On the previous slide, we had this definition outline:
<term> <logical operator> <term> <logical operator>
Each term is either a “nested rule” or a “predicate”.
A nested rule is a method of including the result of another stream within the logic of this one.
The syntax for a nested rule is:
rule "<stream identifier>"
Where the stream identifier is a 32-character alphanumeric string, obtainable from the DataSift page of the stream you wish to include or through the API.
![Page 12: Creating streams with DataSift](https://reader036.vdocument.in/reader036/viewer/2022082512/5552811ab4c905b4598b4eee/html5/thumbnails/12.jpg)
FSDL: Nested Rule Example
This is an example of a simple FSDL definition:
interaction.content contains "justin bieber"
The Stream Identifier for this definition is 4e8e6772337d0b993391ee6417171b79. The stream will include every item whose content contains “justin bieber”.
We can create another rule to filter this down further, using the nested rule syntax:
rule "4e8e6772337d0b993391ee6417171b79" and language.tag == "en"
This performs the same filtering as the first stream, but the language.tag == "en" predicate additionally restricts the results to content determined to be in English.
In this case, the logical operator separating the two terms is “and”.
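As a rough sketch of the nested rule syntax, the Python snippet below builds the `rule "<stream identifier>"` term and checks that the identifier is a 32-character alphanumeric string, as described earlier. The names `nested_rule` and `STREAM_ID_PATTERN` are hypothetical; the check does not confirm that the stream actually exists on DataSift.

```python
import re

# Assumption: "32-character alphanumeric" is read as exactly 32 ASCII
# letters or digits.
STREAM_ID_PATTERN = re.compile(r"[0-9A-Za-z]{32}")


def nested_rule(stream_id):
    """Return the FSDL term that includes another stream's results."""
    if not STREAM_ID_PATTERN.fullmatch(stream_id):
        raise ValueError(f"not a 32-character alphanumeric identifier: {stream_id!r}")
    return f'rule "{stream_id}"'


# Rebuild the example from the slide: the first stream, narrowed to English.
definition = nested_rule("4e8e6772337d0b993391ee6417171b79") + ' and language.tag == "en"'
```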
![Page 13: Creating streams with DataSift](https://reader036.vdocument.in/reader036/viewer/2022082512/5552811ab4c905b4598b4eee/html5/thumbnails/13.jpg)
FSDL: Predicates
Predicates are formed of three items (a target, an operator, and an argument) in the following format:
<target> <operator> <argument>
In the previous example, we saw this predicate used to filter the results of another rule:
language.tag == "en"
In this example, the target is “language.tag”, the operator is “==” (equals), and the argument is “en”.
There is a long list of targets, operators, and the arguments they require in the DataSift Support Documentation.
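One way to picture the target, operator, and argument structure is as a small record that renders itself back into FSDL text. The `Predicate` class below is a hypothetical Python sketch, not a DataSift type. It assumes string arguments are written in double quotes and numeric arguments are written bare, as in the examples on these slides.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Predicate:
    """Hypothetical container for the three parts of a predicate."""
    target: str
    operator: str
    argument: Union[str, int]

    def render(self) -> str:
        # Assumption: strings are quoted, numbers are left unquoted.
        if isinstance(self.argument, str):
            return f'{self.target} {self.operator} "{self.argument}"'
        return f"{self.target} {self.operator} {self.argument}"


english_only = Predicate("language.tag", "==", "en").render()
```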
![Page 14: Creating streams with DataSift](https://reader036.vdocument.in/reader036/viewer/2022082512/5552811ab4c905b4598b4eee/html5/thumbnails/14.jpg)
FSDL: Example Predicates
The following are examples of simple predicates:
interaction.content contains "#rdgtweetup"
twitter.user.friends_count >= 1000
interaction.content contains_word "net"
interaction.geo exists
author.username in "dtsn,nickhalstead,chris_alexander,datasift"
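To make the example predicates above more concrete, here is a Python sketch that evaluates them locally against a dictionary standing in for one piece of content. Everything here is an assumption for illustration: `lookup` and `matches` are hypothetical helpers, and the operator semantics (case-insensitive substring matching for `contains`, whole-word matching for `contains_word`, a comma-separated list for `in`) are guesses based on the operator names. The DataSift Support Documentation is the authority on how each operator really behaves.

```python
import re


def lookup(item, target):
    """Walk a dotted target such as 'twitter.user.friends_count' through nested dicts."""
    node = item
    for key in target.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def matches(item, target, operator, argument=None):
    """Evaluate one predicate against one item, under the assumed semantics."""
    value = lookup(item, target)
    if operator == "exists":
        return value is not None
    if value is None:
        return False
    if operator == "contains":
        # Assumed: case-insensitive substring match.
        return argument.lower() in str(value).lower()
    if operator == "contains_word":
        # Assumed: the argument must appear as a whole word.
        pattern = rf"\b{re.escape(argument)}\b"
        return re.search(pattern, str(value), re.IGNORECASE) is not None
    if operator == "in":
        # Assumed: the argument is a comma-separated list of allowed values.
        return str(value) in [entry.strip() for entry in argument.split(",")]
    if operator == ">=":
        return value >= argument
    if operator == "==":
        return value == argument
    raise ValueError(f"unsupported operator: {operator!r}")
```

Under these assumptions, content reading "Free internet access" would satisfy `contains "net"` but not `contains_word "net"`, which is the distinction the two operators appear designed to draw.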
![Page 15: Creating streams with DataSift](https://reader036.vdocument.in/reader036/viewer/2022082512/5552811ab4c905b4598b4eee/html5/thumbnails/15.jpg)
FSDL: Example Definitions
Here are examples of more complex definitions composed of multiple terms:
(interaction.content contains "Justin Bieber" OR interaction.content contains "Justin Beiber")
(interaction.content contains "Nokia" OR interaction.content contains "Motorola" OR interaction.content contains "Palm") AND interaction.content contains "phone"
interaction.content contains "#rdgfestival" OR interaction.content contains "#readingfestival" OR rule "4315e367618830de6224c479f35db4ca"
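Definitions like these can be generated rather than typed by hand. The Python sketch below rebuilds the phone-brand example using three hypothetical helpers (`contains`, `any_of`, `all_of`) that are not part of DataSift. It assumes parentheses group terms the way the examples suggest.

```python
def contains(target, text):
    """Render a 'contains' predicate for the given target."""
    return f'{target} contains "{text}"'


def any_of(*terms):
    """Join terms with OR and wrap them in parentheses as one group."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms):
    """Join terms with AND."""
    return " AND ".join(terms)


brands = any_of(
    *(contains("interaction.content", brand) for brand in ("Nokia", "Motorola", "Palm"))
)
definition = all_of(brands, contains("interaction.content", "phone"))
```

Wrapping the OR group in parentheses before applying AND keeps the "phone" condition attached to all three brands rather than only the last one.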
![Page 16: Creating streams with DataSift](https://reader036.vdocument.in/reader036/viewer/2022082512/5552811ab4c905b4598b4eee/html5/thumbnails/16.jpg)
API Calls
API calls are available for most DataSift functionality.
Stream: Get, Create, Update, Duplicate, Rate, Delete, List
Comments: Get, Create, Flag
All of these API calls are available through a semi-RESTful interface, in a similar way to the Twitter API.
Data formats supported include JSON, JSONP, XML and PHP (serialized).
Each call is fully documented on the DataSift Support site.
![Page 17: Creating streams with DataSift](https://reader036.vdocument.in/reader036/viewer/2022082512/5552811ab4c905b4598b4eee/html5/thumbnails/17.jpg)
Retrieving Stream Data
Once you have configured your stream with a definition and verified it is correct, you can connect to your stream through a number of methods:
JSON API
HTTP Stream
WebSockets Stream
RSS
The JSON API is simple and similar to how you would access Twitter Search.
The HTTP Stream is similar to the Twitter firehose, giving a constant stream of data through a single connection. The WebSockets Stream works in a similar way but is meant for client-side connections from supported web browsers.
RSS is also available, but is recommended only for lower-volume feeds.
All services are fully documented on the DataSift Support site.
![Page 18: Creating streams with DataSift](https://reader036.vdocument.in/reader036/viewer/2022082512/5552811ab4c905b4598b4eee/html5/thumbnails/18.jpg)
Questions
You can get more help, support, examples and user content on the DataSift Support website:
http://support.datasift.net
You can also ask us on Twitter: @datasift