Logstash on Windows–installation and I/O examples

Part 1: http://msprogrammer.serviciipeweb.ro/2016/12/05/logstash-on-windowsinstallation-and-io-examples/ 

Part 2 : http://msprogrammer.serviciipeweb.ro/2016/12/12/logstash-on-windows-transformation-of-data/

I was very impressed by the declaration of logstash :”Centralize, Transform & Stash Your Data”  . What I think it does it receives, transforms  and outputs data- and it does very configurables.

I will start with some easy examples .

Install

Download the logstash from https://www.elastic.co/downloads/logstash  -  there is a zip file. Download and unblock , then extract all contents .A folder will be created ( the name of my folder is logstash-5.0.0 ). Open a command prompt and run

logstash-5.0.0\bin\logstash

The answer could be several :

1. PC is missing java – go , download, unblock, execute

2. It says something about missing “ server “ folder on java installation – goto your Java installation folder and copy “client” folder to “server”

3. It says “ ERROR: No configuration file was specified “ -  Ok, it is perfect!

Execution

Logstash need input and output to be configured – because it transforms any “input” into any “output” that he knows( via plugins)

Ok, now let’s do a working example – reading and writing to console :

logstash-5.0.0\bin\logstash -e ‘input { stdin { } } output { stdout {} }’

Now when you write something such as

asdasdasd

,the answer will be

2026-11-05T20:11:30.883Z ANDREIPC asdasdasd

Well, this is the first transformation – console to console.

Let’s make something more complicated – now I want to read from tcp port 9000 and output to console – but to see the whole message. For this we will create a file, named tcp.txt, with the following content:

input {
  tcp {
    port => 9000
    type => "tcpLog"
  }
}
  output {
stdout {codec => rubydebug}

}

( the code is for seeing more details about the message  ) And we will run

logstash-5.0.0\bin\logstash -f tcp.txt

Somewhere logstash should say:

Starting tcp input listener {:address=>"0.0.0.0:9000"}

In a separate window, I will start

telnet 127.0.0.1 9000

and enter the same text

asdasd

The answer will be:

{
    "@timestamp" => 2026-11-05T20:27:18.047Z,
          "port" => 51037,
      "@version" => "1",
           "host" => "127.0.0.1",
       "message" => "asdasd\r",
          "type" => "tcpLog"
}

And it is more clear now -  we jave telnet=> console.

Let’s say that now we want to write the output a file. I will modify tcp.txt to add to output the file plugin :

input {
  tcp {
    port => 9000
    type => "tcpLog"
  }
}
  output {
stdout {codec => rubydebug}
file {    path => "a.txt"     }
}

We start again logstash with

logstash-5.0.0\bin\logstash -f tcp.txt

and the telnet console with

telnet 127.0.0.1 9000

and enter the same text

asdasd

The answer will be now:

{
    "@timestamp" => 2026-11-05T20:31:47.639Z,
          "port" => 51213,
      "@version" => "1",
          "host" => "127.0.0.1",
       "message" => "asdasdad\r",
          "type" => "tcpLog"
}
[2026-11-05T22:31:48,534][INFO ][logstash.outputs.file    ] Opening file {:path=>"a.txt"}

For more outputs(such as csv , http. mongodb and others) , please see https://www.elastic.co/guide/en/logstash/current/output-plugins.html

For more inputs(such as file, http, github and others) please see  https://www.elastic.co/guide/en/logstash/current/input-plugins.html

Async + sync completion + lock

Let’s suppose that you have 2 functions = one true sync and one true async . Let’s suppose more that the sync one can must finish before calling next time ( think about some weird COM component or , in our days, SqlConnection )

If we look just at the async function , we could wait for all. However, the sync function must be called sync

One solution is to make different calls:


var list = new List<Task<int>>(nr);
//add async to list

await Task.WhenAll(list)

//call each sync



But I want to make really async – and one solution is lock

We could wrote the sync function in async manner like this:


//SemaphoreSlim sem = new SemaphoreSlim(1);
static object myObject = new object();
async Task<int> syncTask(int i)         {             
lock (myObject)             
//try{                 
//await sem.WaitAsync();                                 
CallSyncFunction();                             
//}             
//finally             
//{                 
//sem.Release();             
//}                                   
}

The trick that I  used is lock, because the syntax is easier than SemaphoreSlim (BTW: if you want to learn about threading, jump directly to http://www.albahari.com/threading/)
In this manner, the sync function is modified in async -and, more, it waits for completion before entering the second time.

The code is on github on

More details in the Wiki https://github.com/ignatandrei/AsyncSyncLock/wiki or in the code https://github.com/ignatandrei/AsyncSyncLock/

Pivot / Transform rows from one column into multiple columns with R in SqlServer

Let’s say we have a table  [CurrencyValues] with the following datas

CurrencyID CurrencyValue ValueDate
CAD 2.9929 9/27/2016 0:00
EUR 4.4495 9/27/2016 0:00
GBP 5.1205 9/27/2016 0:00
USD 3.9567 9/27/2016 0:00
CAD 3.0006 9/26/2016 0:00
EUR 4.4504 9/26/2016 0:00
GBP 5.1213 9/26/2016 0:00
USD 3.9589 9/26/2016 0:00
CAD 3.0397 9/23/2016 0:00
EUR 4.447 9/23/2016 0:00

We want to transform the CurrencyID into columns( i.e. a column with CAD, other with EUR, other with GBP and so on) .

We can utilize R from SqlServer (https://msdn.microsoft.com/en-us/library/mt604885.aspx )

We have 2 prerequisites;

1. R is running inside SqlServer

2. The library tidyR is downloaded

( see the final of this post to verify)

Then we execute the following script:

 

 

execute sp_execute_exterhttp://msprogrammer.serviciipeweb.ro/wp-admin/widgets.phpnal_script
@language = N'R'
, @script = N'
library(tidyr);
model&lt;-spread(InputDataSet, CurrencyID, CurrencyValue);
OutputDataSet &lt;- model'
, @input_data_1 = N'select   CurrencyID, CurrencyValue, ValueDate  from CurrencyValues '

WITH RESULT SETS undefined

The output is this:

(No column name) (No column name) (No column name) (No column name) (No column name)
00:00.0 3.0397 4.447 NULL NULL
00:00.0 3.0006 4.4504 5.1213 3.9589
00:00.0 2.9929 4.4495 5.1205 3.9567

The problem is that we have columns without names (spread returns data frame columns names, but SqlServer does not … yet, I hope)

So we add the column names:

execute sp_execute_external_script
@language = N'R'
, @script = N'
library(tidyr);
model&lt;-spread(InputDataSet, CurrencyID, CurrencyValue);
cols&lt;-paste(colnames(model),collapse = ",");
model&lt;-cbind(ColumnNames = cols, model);

OutputDataSet &lt;- model'
, @input_data_1 = N'select   CurrencyID, CurrencyValue, ValueDate  from CurrencyValues '

WITH RESULT SETS undefined

and  the result is slightly better:

(No column name) (No column name) (No column name) (No column name) (No column name) (No column name)
ValueDate,CAD,EUR,GBP,USD 00:00.0 3.0397 4.447 NULL NULL
ValueDate,CAD,EUR,GBP,USD 00:00.0 3.0006 4.4504 5.1213 3.9589
ValueDate,CAD,EUR,GBP,USD 00:00.0 2.9929 4.4495 5.1205 3.9567

What I want? Sql Server should return column names in data frame by default …

And that is all  –  execute sp_execute_external_script have the possibility of parameters, so you can make general.

End!

——————————————————————-

Note 1: R is running inside SqlServer

Verify

exec sp_execute_external_script  
  @language =N'R',    
  @script=N'OutputDataSet&lt;-InputDataSet',      
  @input_data_1 =N'select 1 as hello'    
  with result sets (([hello] int not null));    
go

If it gives error , then follow https://msdn.microsoft.com/en-us/library/mt696069.aspx  – basic is

Exec sp_configure  'external scripts enabled', 1  
Reconfigure  with  override

and restart SqlServer

 

Note 2: Installing package tidyr

Follow https://msdn.microsoft.com/en-us/library/mt591989.aspx : basic is starting R with admin rights and executing

lib.SQL <- "C:\\Program Files\\Microsoft SQL Server\\MSSQL13.MSSQLSERVER\\R_SERVICES\\library"
install.packages("tidyr", lib = lib.SQL)

and that will be all ( ok, you may find some additional libraries to install)

Forecasting EUR–BNR currency exchange

Part 1: Starting with R – installation : http://msprogrammer.serviciipeweb.ro/2016/10/03/starting-with-r-as-in-sqlserver-r-server-and-vs-package/

Part 2: Starting with R – language and data http://msprogrammer.serviciipeweb.ro/2016/10/10/starting-with-r-language-and-data/

Part 3: Starting with R – basic and statistics http://msprogrammer.serviciipeweb.ro/2016/10/17/starting-with-r-basic-and-statistics/

Part 4: Making diagrams with R ( and transformation of data) : http://msprogrammer.serviciipeweb.ro/2016/10/31/making-diagrams-with-r/

Part 5:  Forecasting currency exchange with R   http://msprogrammer.serviciipeweb.ro/2016/11/06/forecasting-eurbnr-currency-exchange/

Let’s suppose that we have data from previous post

The form of the data was:

head(EUR,10)

      EUR                Data   RealDate Year
1  4.4629 2016-03-09 00:00:00 2016-03-09 2016
2  4.4624 2016-03-11 00:00:00 2016-03-11 2016
3  4.4705 2016-03-15 00:00:00 2016-03-15 2016
4  4.4775 2016-03-16 00:00:00 2016-03-16 2016
5  4.4765 2016-03-17 00:00:00 2016-03-17 2016
6  4.4694 2016-03-22 00:00:00 2016-03-22 2016
7  4.4641 2016-03-23 00:00:00 2016-03-23 2016
8  4.4655 2016-03-24 00:00:00 2016-03-24 2016
9  4.4639 2016-03-25 00:00:00 2016-03-25 2016
10 4.4619 2016-03-28 00:00:00 2016-03-28 2016

After reading materials

http://www.statoek.wiso.uni-goettingen.de/veranstaltungen/zeitreihen/sommer03/ts_r_intro.pdf

http://a-little-book-of-r-for-time-series.readthedocs.io/en/latest/src/timeseries.html

https://www.otexts.org/fpp/8/7

http://stats.stackexchange.com/questions/14742/auto-arima-with-daily-data-how-to-capture-seasonality-periodicity

https://www.r-bloggers.com/time-series-analysis-using-r-forecast-package/

http://www.statmethods.net/advstats/timeseries.html

and understanding that time series is not for beginners, I decide to find R packages that can do the job for me “automatically” . I mean, I give the pervious data –give me the forecast

So this link proved to be useful: http://blog.revolutionanalytics.com/2013/06/learning-time-series-with-r.html 

So this is the code  to importing forecast package( to forecast) and xts – extended time series – to can specify dates .

install.packages("forecast")
install.packages("xts")
library("forecast")
library("xts")

Then the real code:

myts <- xts(eur$EUR, eur$RealDate)

To see the data

head(myts,3)

and the result:

             [,1]
1999-01-04 1.3062
1999-01-05 1.3169
1999-01-06 1.3155

Data from backwards

tail(myts,3)

and the result

            [,1]
2016-09-23 4.4470
2016-09-26 4.4504
2016-09-27 4.4495

Now to the forecasting: tried to generate a model with auto.arima
fit<- auto.arima(myts)

auto.arima result was a warning also

Warning message:
In auto.arima(myts) :
  Unable to fit final model using maximum likelihood. AIC value approximated

So I tried

fit <- ets(myts)

that worked without warnings.

So let’s try a forecast for 5 days:

forecast(fit,5)

and the result

   Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
3835       4.449745 4.428025 4.471464 4.416527 4.482962
3836       4.449989 4.419210 4.480769 4.402916 4.497062
3837       4.450234 4.412485 4.487983 4.392502 4.507966
3838       4.450478 4.406836 4.494120 4.383734 4.517223
3839       4.450723 4.401874 4.499572 4.376014 4.525431

( below the arima result

     Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
3835       4.449287 4.427748 4.470825 4.416347 4.482226
3836       4.448930 4.416600 4.481259 4.399485 4.498374
3837       4.448847 4.409238 4.488455 4.388270 4.509423
3838       4.448763 4.403960 4.493565 4.380244 4.517282
3839       4.448666 4.399278 4.498054 4.373133 4.524198

)

And this is the situation

RealValues

(RV)

ETS

Dif ETS

(ETS-RV)/RV*100

arima

Dif Arima

(A-RV)/RV*100

4.4493

4.449745

0.0100%

4.449287

0.0003%

4.4514

4.449989

0.0317%

4.448930

0.0555%

4.4523

4.450234

0.0464%

4.448847

0.0776%

4.4484

4.450478

0.0467%

4.448763

0.0082%

4.4576

4.450723

0.1543%

4.448666

0.2004%

Apparently , ETS is pretty ok ( with the exception of first result)