Big Data Stream Mining - Sunseed EU | Sustainable and...

Date posted: 18-Jun-2018

SUNSEED project is partially funded by EC FP7 programme under grant agreement #619437.

Big Data Stream Mining
• Maintain summaries of the streams, sufficient to answer the expected queries about the data:
  • Summaries can take various forms: clusters (flat or hierarchical), statistical aggregates, …
• Maintain a sliding window of the most recently arrived data:
  • Operations on a sliding window mimic more traditional database/mining operations
• Sampling: obtain a representative data sample (i.e., one on which the required operations can be performed correctly)
  • Smart sampling (x % from a stream of multiple data sources; alternatively, take y % of selected data sources)
• Similarity comparison – smart indexing
• Incremental updating of predictive models
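As an illustration of the sliding-window and sampling points above, here is a minimal Python sketch. It is not taken from the poster: the class and function names are hypothetical, the running mean stands in for "statistical aggregates", and reservoir sampling (Algorithm R) is one standard way to draw a uniform sample from a stream of unknown length, not necessarily the sampling scheme the authors used.

```python
import random
from collections import deque


class SlidingWindowStats:
    """Keep the last `size` readings and a running sum over them (hypothetical helper)."""

    def __init__(self, size):
        self.window = deque(maxlen=size)
        self.total = 0.0

    def add(self, value):
        # If the window is full, the oldest reading is about to be evicted,
        # so remove it from the running sum first.
        if len(self.window) == self.window.maxlen:
            self.total -= self.window[0]
        self.window.append(value)
        self.total += value

    def mean(self):
        # Returns None until at least one reading has arrived.
        return self.total / len(self.window) if self.window else None


def reservoir_sample(stream, k, rng=None):
    """Uniform random sample of up to k items from a stream (Algorithm R)."""
    rng = rng or random.Random()
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = item
    return sample


# Usage with made-up load readings (kW).
stats = SlidingWindowStats(size=3)
for load_kw in [10.0, 12.0, 14.0, 16.0]:
    stats.add(load_kw)
print(stats.mean())  # mean of the last three readings: 14.0

print(reservoir_sample(range(1000), k=10, rng=random.Random(0)))
```

Both structures use memory that depends only on the window size or sample size, not on the length of the stream, which is the property the bullet points rely on.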

M. Skrjanc, B. Kazic
{Maja.Skrjanc, Blaz.Kazic}@ijs.si
Jozef Stefan Institute, Jamova ul. 39, Ljubljana, Slovenia

Forecasting in Smart Grids
• Types of forecasting problems:
  • Electricity load (short term, medium term, long term)
  • Renewable sources generation
  • Electricity prices
  • Customer segmentation
• Input sources:
  • Historical load variables: used for learning models and detecting short-term trends
  • Meteorological data: known to be correlated with load (depends on location)
  • Static data: such as special calendar data (holidays, summer season) and the topology of the electrical grid
• Methods used:
  • Naive approach: localized averages, previous values. Computationally undemanding, fast, robust and easy to maintain. Can work surprisingly well.
  • Classical approaches: autoregressive (ARIMA) and regression-based statistical methods. Based on historical data. Can take advantage of seasonal trends, but usually don't include other data sources.
  • Computational intelligence approaches: artificial neural networks, support vector machines. Data-driven approach that can take advantage of various heterogeneous data sources.
  • Hybrid methods: combine two or more different approaches in order to take advantage of each method's benefits and overcome their drawbacks.
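The naive approach listed above ("localized averages, previous values") can be sketched in a few lines of Python. This is an illustrative reading of those two phrases, not the authors' implementation: the function names are hypothetical, and the default period of 24 assumes hourly load readings with a daily cycle.

```python
from statistics import mean


def persistence_forecast(history):
    """Forecast the next value as the most recent observation."""
    return history[-1]


def seasonal_naive_forecast(history, period=24):
    """Forecast the next value as the observation one full period earlier."""
    return history[-period]


def localized_average_forecast(history, period=24, cycles=3):
    """Average the observations at the same position in the last `cycles` periods.

    Requires at least one full period of history.
    """
    # history[-period] lies one period before the step being forecast,
    # history[-2 * period] two periods before, and so on.
    past = [
        history[-period * c]
        for c in range(1, cycles + 1)
        if period * c <= len(history)
    ]
    return mean(past)


# Made-up hourly load for three days: a rising daily profile that shifts up each day.
history = [100 + hour + 10 * day for day in range(3) for hour in range(24)]

print(persistence_forecast(history))        # last reading: 143
print(seasonal_naive_forecast(history))     # same hour yesterday: 120
print(localized_average_forecast(history))  # mean of 120, 110, 100: 110
```

Forecasts like these need no training and no external data, which makes them a cheap baseline to compare the classical and computational intelligence methods against.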
