FERARI: Flexible Event Processing for Big Data Architectures

Contract Information

Programme: SEVENTH FRAMEWORK PROGRAMME
Programme Acronym: FP7-ICT
Contract Type: SPECIFIC TARGETED RESEARCH PROJECT
Start Date: 2014-02-01
End Date: 2017-01-31
Contract No: 619491
Role for SoftNet: contractor
Funding for SoftNet:  
Principal Investigator for SoftNet: Minos Garofalakis

Project Information

Official Web Site: http://www.ferari-project.eu/ 

In recent years, the number of data-generating devices has grown rapidly: mobile phones, sensors in cars, smart home devices, and industrial machines. As a consequence, exponentially growing volumes of data from these sources can be stored and processed. Many of today's Big Data technologies were built on the tacit assumption of web-based systems processing data ultimately generated by humans. Human-generated data is predominantly persistent, i.e., it must be stored for relatively long periods of time. As a result, Big Data technologies to date have mainly focused on batch processing of data stored on distributed file systems. As Big Data finds its way into other business areas, this design decision becomes limiting. An area with great future potential is machine-to-machine (M2M) interaction and the Internet of Things. This, however, requires processing massive and predominantly transient data streams. Consequently, current Big Data technology is inadequate for processing the contemporary and expected volumes of M2M and similar data.

The FERARI vision. The goal of the FERARI project is to address these bottlenecks and to pave the way for efficient and timely processing of Big Data. We intend to exploit the structured nature of M2M data while retaining the flexibility required for handling unstructured data elements. Taking the structure of the data into account will enable business users to express complex tasks, such as efficiently identifying sequences of events over distributed sources with complex relations, or learning and monitoring sophisticated abstract models of the data. These tasks will be expressed in a high-level declarative language (as opposed to being programmed manually, as is the case in current streaming systems), and the system will perform them in an efficient and timely manner. A simple illustration of such an event-sequence detection task is sketched below.
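To make the kind of task concrete, the following is a minimal, hand-coded sketch in Python of the sort of event-sequence detection that a declarative language would let users state as a single pattern query. The event types, machine identifiers, and time window are invented for illustration only and are not part of FERARI or its query language.

```python
# Illustrative sketch (not FERARI's actual language or engine): detect the
# sequence "temperature spike followed by a pressure drop on the same machine
# within 60 seconds" over a stream of timestamped events.

from collections import namedtuple

Event = namedtuple("Event", ["kind", "machine", "value", "ts"])  # ts in seconds

def detect_sequence(stream, window=60.0):
    """Yield (spike, drop) pairs where a 'pressure_drop' follows a
    'temp_spike' on the same machine within `window` seconds."""
    pending = {}  # machine -> most recent temp_spike event
    for ev in stream:
        if ev.kind == "temp_spike":
            pending[ev.machine] = ev
        elif ev.kind == "pressure_drop":
            spike = pending.get(ev.machine)
            if spike is not None and 0 <= ev.ts - spike.ts <= window:
                yield spike, ev
                del pending[ev.machine]

events = [
    Event("temp_spike", "m1", 93.0, 10.0),
    Event("pressure_drop", "m2", 1.1, 20.0),   # different machine: no match
    Event("pressure_drop", "m1", 0.9, 45.0),   # matches the spike at t=10
]

for spike, drop in detect_sequence(events):
    print(f"alert: {spike.machine} spiked at t={spike.ts}, dropped at t={drop.ts}")
```

Even this simple pattern requires explicit state management when coded by hand; a declarative specification would state only the sequence, the correlation key (the machine), and the time window, leaving the distributed and efficient evaluation to the system.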