Category: r

Fertility Rate on European Countries with R

This is the source from https://notebooks.azure.com/ignatandrei/libraries/EuropeStatData – FertitlityRate

Preload packages

Pacman

First install pacman to install once

In [1]:
if (!require("pacman")) install.packages("pacman")
              

Loading required package: pacman
              

Load the necessary packages

In [2]:
pacman::p_load('XML', "magrittr","ggplot2", "RCurl", "rlist", "rvest", "dplyr", "devtools","assertthat","XLConnect","tidyr")
              

Use the functions in the packages

In [3]:
library(stringr)
library(magrittr)
library(XML)
library(RCurl)
library(rlist)
library(ggplot2)
library(rvest)
library(tidyr)
              

Primary data

Download fertility rate for regions / counties

In [4]:
theUrl<-"http://ec.europa.eu/eurostat/web/products-datasets/-/demo_r_frate3"
content <- read_html(theUrl)
urlsDownload <- html_nodes(x = content, xpath = '//a') %>% # find tags a
                 html_attr("href") %>% # find href attribute\n",
                .[grepl(glob2rx("*BulkDownload*"), .)]  # glob2rx  - transform regular expression\n",

#print(urlsDownload)\n",
assert_that(length(urlsDownload) ==1, msg= paste("urlsDownload must have length 1! real length", length(urlsDownload)))
urlFile<-urlsDownload[1]
print(paste("Downloading from ",urlFile))
              

TRUE

[1] "Downloading from  http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?file=data/demo_r_frate3.tsv.gz"
              

Now download file from {{theUrl}} in order to see it

In [5]:
#print(getwd())
dest<-file.path(getwd(),"file.gz")
download.file(urlFile,dest, mode = "wb", cacheOK = F) #mode binary\n",
assert_that(file.exists(dest))
              

TRUE

Read data

In [6]:
data<-as.data.frame(read.table(gzfile(dest),fill = TRUE, header=T) )
keeps <- c("unit.geo.time", "X2015")
print(head(data))
              

  unit.geo.time X2015 X2014
1         NR,AL  1.67  1.79
2        NR,AL0  1.67  1.76
3         NR,AT  1.50  1.47
4        NR,AT1  1.45  1.43
5       NR,AT11  1.37  1.33
6      NR,AT111  1.39  1.39
              

Tidying the data

Split the first colum

In [7]:
data<- data[keeps] %>%
 separate(unit.geo.time, c("unit", "geo"), ",")
      
print(head(data))
              

  unit   geo X2015
1   NR    AL  1.67
2   NR   AL0  1.67
3   NR    AT  1.50
4   NR   AT1  1.45
5   NR  AT11  1.37
6   NR AT111  1.39
              

Dictionary of metadata

Download dictionary

In [8]:
theUrlNuts<-"http://ec.europa.eu/eurostat/ramon/documents/nuts/NUTS_2013.zip"
destNuts<-file.path(getwd(),"nuts.zip")
download.file(theUrlNuts,destNuts, mode = "wb", cacheOK = F) #mode binary\n"
assert_that(file.exists(destNuts))
              

TRUE

Reading into memory

In [9]:
nuts3<-as.data.frame(readWorksheetFromFile(unzip(destNuts),sheet = 1))
              

Tidy the data

In [10]:
keeps <- c("NUTS.CODE", "NUTS.LABEL","NUTS.LEVEL","COUNTRY.CODE")
nuts3<- nuts3[keeps]
print(head(nuts3))
              

  NUTS.CODE
1        BE
2       BE1
3      BE10
4     BE100
5       BE2
6      BE21
                                                            NUTS.LABEL
1                                              BELGIQUE-BELGI<U+00CB> 
2 R<U+00C9>GION DE BRUXELLES-CAPITALE / BRUSSELS HOOFDSTEDELIJK GEWEST
3 R<U+00E9>gion de Bruxelles-Capitale / Brussels Hoofdstedelijk Gewest
4              Arr. de Bruxelles-Capitale / Arr. van Brussel-Hoofdstad
5                                                        VLAAMS GEWEST
6                                                      Prov. Antwerpen
  NUTS.LEVEL COUNTRY.CODE
1          0           BE
2          1           BE
3          2           BE
4          3           BE
5          1           BE
6          2           BE
              

Merging the data with metadata

In [11]:
allData <- merge(nuts3,data,by.x="NUTS.CODE",by.y="geo")
allData$Val <- with(allData, as.numeric(as.character(X2015)))
print(head(allData))
              

  NUTS.CODE           NUTS.LABEL NUTS.LEVEL COUNTRY.CODE unit X2015  Val
1        AT    <U+00D6>STERREICH          0           AT   NR  1.50 1.50
2       AT1 OST<U+00D6>STERREICH          1           AT   NR  1.45 1.45
3      AT11          Burgenland           2           AT   NR  1.37 1.37
4     AT111     Mittelburgenland          3           AT   NR  1.39 1.39
5     AT112       Nordburgenland          3           AT   NR  1.36 1.36
6     AT113 S<U+00FC>dburgenland          3           AT   NR  1.38 1.38
              

Data for countries

Filtering data

In [12]:
allCountries <- allData %>%
     filter(NUTS.LEVEL==0) %>%
     select(NUTS.CODE,Val) %>%
    as.data.frame

print(allCountries)
              

   NUTS.CODE  Val
1         AT 1.50
2         BE 1.70
3         BG 1.53
4         CY 1.33
5         CZ 1.57
6         DE 1.50
7         DK 1.70
8         EE 1.58
9         EL 1.33
10        ES 1.34
11        FI 1.65
12        FR 1.96
13        HR 1.40
14        HU 1.45
15        IE 1.94
16        IT 1.35
17        LT 1.71
18        LU 1.47
19        LV 1.71
20        MT 1.45
21        NL 1.66
22        PL 1.33
23        PT 1.30
24        RO 1.58
25        SE 1.85
26        SI 1.58
27        SK 1.41
28        UK 1.81
              

Show data in graphic form

In [13]:
sapply(allCountries, mode)
png(filename="allCountries.png", width=600, height=600)
ggplot(allCountries, aes(x=reorder(NUTS.CODE,-Val), y=Val))+geom_point()
dev.off()
              

NUTS.CODE
‘character’
Val
‘numeric’

png: 2

Image for all countries

All Countries

Display data for a chosen country

Chosing the country

In [14]:
countryCode <- "RO"
              

Filtering the data

Filtering data for {{countryCode}}

In [15]:
#head(allData)
Country<-allData %>% 
filter(COUNTRY.CODE==CountryCode ) %>%
filter(NUTS.LEVEL==3) %>%
select(NUTS.LABEL, NUTS.CODE, Val)

Country <- as.data.frame(Country)
#print(Country)
              

Error in filter_impl(.data, quo): Evaluation error: object 'CountryCode' not found.
Traceback:

1. allData %>% filter(COUNTRY.CODE == CountryCode) %>% filter(NUTS.LEVEL == 
 .     3) %>% select(NUTS.LABEL, NUTS.CODE, Val)
2. withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
3. eval(quote(`_fseq`(`_lhs`)), env, env)
4. eval(quote(`_fseq`(`_lhs`)), env, env)
5. `_fseq`(`_lhs`)
6. freduce(value, `_function_list`)
7. function_list[[i]](value)
8. filter(., COUNTRY.CODE == CountryCode)
9. filter.data.frame(., COUNTRY.CODE == CountryCode)
10. as.data.frame(filter(tbl_df(.data), ...))
11. filter(tbl_df(.data), ...)
12. filter.tbl_df(tbl_df(.data), ...)
13. filter_impl(.data, quo)

Show data in graphic form

In [ ]:
#sapply(Country, mode)
file<- paste('Country',CountryCode,'.png', sep='')
print(file)
png(filename=file , width=600, height=600)
ggplot(Country, aes(x=Val, y=reorder(NUTS.LABEL,-Val)))+geom_point()
dev.off()
              

Show graphic about country

alt text

Download large files with R and RCurl

First, read documentation. Try with

#getCurlOptionTypes(opts = getCurlOptionsConstants())
#print (getCurlOptionTypes)
#listCurlOptions()
#getCurlOptionsConstants()
#getCurlOptionTypes(opts = getCurlOptionsConstants())

Then, this is the code ( pretty self explanatory)

downloadFile <-function(url, file){
      curl = getCurlHandle()
curlSetOpt( .opts = list(http.version =HTTP_VERSION_1_0, verbose = TRUE),
                curl = curl)
    
     f = CFILE(file, mode=”wb”)
     a = curlPerform(url = url,curl=curl, writedata = f@ref, noprogress=FALSE)
     close(f)

}

Making IsThisTaxiLegal–evolving from Proof of Concept

What we need first is automating some tasks – like downloading the authorizations pdf ( and parsing after) from http://www.pmb.ro/adrese_utile/transport_urban/autorizatii_taxi/autorizatii_TAXI.php .

For this I want to use R – because C# I know already and is a good occasion to learn R.

The file is on https://github.com/ignatandrei/IsThisTaxiLegal/blob/master/applications/RDownloadFile/Script.R 

Some ideas from here from a C# developer:

  • Very easy to work with R – as soon as you grasp the concepts
  • Nuget from C# = CRAN from R
  • Does not provide in VS something like nuget, but pacman is perfect

if (!require(“pacman”)) install.packages(“pacman”)
pacman::p_load(‘XML’, “magrittr”, “RCurl”, “rlist”, “rvest”, “pdftools”, “dplyr”, “devtools”)
#if (!require(XML)) install.packages(‘XML’)
#install.packages(“magrittr”)
#install.packages(“RCurl”)
#install.packages(“rlist”)
#install.packages(“rvest”)
library(stringr)
library(magrittr)
library(XML)
library(RCurl)
library(rlist)
library(rvest)

  • R is supports normal programing and  functional programming  and magrittr helps

Code example:

print(head(splitData))  # normal

splitData %>% head(10) %>% print # showing data # function

  • Some WTF ( paste function is concatenate strings  ,  return in function must have () )
  • Many , many WTF when you try to make something in R until you grasp it. ( partially)
  • CRAN have many libraries free – someones requires java on your PC …. C# has a good point here , because he has now Linux + Mac + Windows natively and self contained

You will find code at https://github.com/ignatandrei/IsThisTaxiLegal/blob/master/applications/RDownloadFile/Script.R

Pivot / Transform rows from one column into multiple columns with R in SqlServer

Let’s say we have a table  [CurrencyValues] with the following datas

CurrencyID CurrencyValue ValueDate
CAD 2.9929 9/27/2016 0:00
EUR 4.4495 9/27/2016 0:00
GBP 5.1205 9/27/2016 0:00
USD 3.9567 9/27/2016 0:00
CAD 3.0006 9/26/2016 0:00
EUR 4.4504 9/26/2016 0:00
GBP 5.1213 9/26/2016 0:00
USD 3.9589 9/26/2016 0:00
CAD 3.0397 9/23/2016 0:00
EUR 4.447 9/23/2016 0:00

We want to transform the CurrencyID into columns( i.e. a column with CAD, other with EUR, other with GBP and so on) .

We can utilize R from SqlServer (https://msdn.microsoft.com/en-us/library/mt604885.aspx )

We have 2 prerequisites;

1. R is running inside SqlServer

2. The library tidyR is downloaded

( see the final of this post to verify)

Then we execute the following script:

 

 

execute sp_execute_exterhttp://msprogrammer.serviciipeweb.ro/wp-admin/widgets.phpnal_script
@language = N'R'
, @script = N'
library(tidyr);
model&lt;-spread(InputDataSet, CurrencyID, CurrencyValue);
OutputDataSet &lt;- model'
, @input_data_1 = N'select   CurrencyID, CurrencyValue, ValueDate  from CurrencyValues '

WITH RESULT SETS undefined

The output is this:

(No column name) (No column name) (No column name) (No column name) (No column name)
00:00.0 3.0397 4.447 NULL NULL
00:00.0 3.0006 4.4504 5.1213 3.9589
00:00.0 2.9929 4.4495 5.1205 3.9567

The problem is that we have columns without names (spread returns data frame columns names, but SqlServer does not … yet, I hope)

So we add the column names:

execute sp_execute_external_script
@language = N'R'
, @script = N'
library(tidyr);
model&lt;-spread(InputDataSet, CurrencyID, CurrencyValue);
cols&lt;-paste(colnames(model),collapse = ",");
model&lt;-cbind(ColumnNames = cols, model);

OutputDataSet &lt;- model'
, @input_data_1 = N'select   CurrencyID, CurrencyValue, ValueDate  from CurrencyValues '

WITH RESULT SETS undefined

and  the result is slightly better:

(No column name) (No column name) (No column name) (No column name) (No column name) (No column name)
ValueDate,CAD,EUR,GBP,USD 00:00.0 3.0397 4.447 NULL NULL
ValueDate,CAD,EUR,GBP,USD 00:00.0 3.0006 4.4504 5.1213 3.9589
ValueDate,CAD,EUR,GBP,USD 00:00.0 2.9929 4.4495 5.1205 3.9567

What I want? Sql Server should return column names in data frame by default …

And that is all  –  execute sp_execute_external_script have the possibility of parameters, so you can make general.

End!

——————————————————————-

Note 1: R is running inside SqlServer

Verify

exec sp_execute_external_script  
  @language =N'R',    
  @script=N'OutputDataSet&lt;-InputDataSet',      
  @input_data_1 =N'select 1 as hello'    
  with result sets (([hello] int not null));    
go

If it gives error , then follow https://msdn.microsoft.com/en-us/library/mt696069.aspx  – basic is

Exec sp_configure  'external scripts enabled', 1  
Reconfigure  with  override

and restart SqlServer

 

Note 2: Installing package tidyr

Follow https://msdn.microsoft.com/en-us/library/mt591989.aspx : basic is starting R with admin rights and executing

lib.SQL <- "C:\\Program Files\\Microsoft SQL Server\\MSSQL13.MSSQLSERVER\\R_SERVICES\\library"
install.packages("tidyr", lib = lib.SQL)

and that will be all ( ok, you may find some additional libraries to install)

Forecasting EUR–BNR currency exchange

Part 1: Starting with R – installation : http://msprogrammer.serviciipeweb.ro/2016/10/03/starting-with-r-as-in-sqlserver-r-server-and-vs-package/

Part 2: Starting with R – language and data http://msprogrammer.serviciipeweb.ro/2016/10/10/starting-with-r-language-and-data/

Part 3: Starting with R – basic and statistics http://msprogrammer.serviciipeweb.ro/2016/10/17/starting-with-r-basic-and-statistics/

Part 4: Making diagrams with R ( and transformation of data) : http://msprogrammer.serviciipeweb.ro/2016/10/31/making-diagrams-with-r/

Part 5:  Forecasting currency exchange with R   http://msprogrammer.serviciipeweb.ro/2016/11/06/forecasting-eurbnr-currency-exchange/

Let’s suppose that we have data from previous post

The form of the data was:

head(EUR,10)

      EUR                Data   RealDate Year
1  4.4629 2016-03-09 00:00:00 2016-03-09 2016
2  4.4624 2016-03-11 00:00:00 2016-03-11 2016
3  4.4705 2016-03-15 00:00:00 2016-03-15 2016
4  4.4775 2016-03-16 00:00:00 2016-03-16 2016
5  4.4765 2016-03-17 00:00:00 2016-03-17 2016
6  4.4694 2016-03-22 00:00:00 2016-03-22 2016
7  4.4641 2016-03-23 00:00:00 2016-03-23 2016
8  4.4655 2016-03-24 00:00:00 2016-03-24 2016
9  4.4639 2016-03-25 00:00:00 2016-03-25 2016
10 4.4619 2016-03-28 00:00:00 2016-03-28 2016

After reading materials

http://www.statoek.wiso.uni-goettingen.de/veranstaltungen/zeitreihen/sommer03/ts_r_intro.pdf

http://a-little-book-of-r-for-time-series.readthedocs.io/en/latest/src/timeseries.html

https://www.otexts.org/fpp/8/7

http://stats.stackexchange.com/questions/14742/auto-arima-with-daily-data-how-to-capture-seasonality-periodicity

https://www.r-bloggers.com/time-series-analysis-using-r-forecast-package/

http://www.statmethods.net/advstats/timeseries.html

and understanding that time series is not for beginners, I decide to find R packages that can do the job for me “automatically” . I mean, I give the pervious data –give me the forecast

So this link proved to be useful: http://blog.revolutionanalytics.com/2013/06/learning-time-series-with-r.html 

So this is the code  to importing forecast package( to forecast) and xts – extended time series – to can specify dates .

install.packages("forecast")
install.packages("xts")
library("forecast")
library("xts")

Then the real code:

myts <- xts(eur$EUR, eur$RealDate)

To see the data

head(myts,3)

and the result:

             [,1]
1999-01-04 1.3062
1999-01-05 1.3169
1999-01-06 1.3155

Data from backwards

tail(myts,3)

and the result

            [,1]
2016-09-23 4.4470
2016-09-26 4.4504
2016-09-27 4.4495

Now to the forecasting: tried to generate a model with auto.arima
fit<- auto.arima(myts)

auto.arima result was a warning also

Warning message:
In auto.arima(myts) :
  Unable to fit final model using maximum likelihood. AIC value approximated

So I tried

fit <- ets(myts)

that worked without warnings.

So let’s try a forecast for 5 days:

forecast(fit,5)

and the result

   Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
3835       4.449745 4.428025 4.471464 4.416527 4.482962
3836       4.449989 4.419210 4.480769 4.402916 4.497062
3837       4.450234 4.412485 4.487983 4.392502 4.507966
3838       4.450478 4.406836 4.494120 4.383734 4.517223
3839       4.450723 4.401874 4.499572 4.376014 4.525431

( below the arima result

     Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
3835       4.449287 4.427748 4.470825 4.416347 4.482226
3836       4.448930 4.416600 4.481259 4.399485 4.498374
3837       4.448847 4.409238 4.488455 4.388270 4.509423
3838       4.448763 4.403960 4.493565 4.380244 4.517282
3839       4.448666 4.399278 4.498054 4.373133 4.524198

)

And this is the situation

RealValues

(RV)

ETS

Dif ETS

(ETS-RV)/RV*100

arima

Dif Arima

(A-RV)/RV*100

4.4493

4.449745

0.0100%

4.449287

0.0003%

4.4514

4.449989

0.0317%

4.448930

0.0555%

4.4523

4.450234

0.0464%

4.448847

0.0776%

4.4484

4.450478

0.0467%

4.448763

0.0082%

4.4576

4.450723

0.1543%

4.448666

0.2004%

Apparently , ETS is pretty ok ( with the exception of first result) 

Making diagrams with R

Part 1: Starting with R – installation : http://msprogrammer.serviciipeweb.ro/2016/10/03/starting-with-r-as-in-sqlserver-r-server-and-vs-package/

Part 2: Starting with R – language and data http://msprogrammer.serviciipeweb.ro/2016/10/10/starting-with-r-language-and-data/

Part 3: Starting with R – basic and statistics http://msprogrammer.serviciipeweb.ro/2016/10/17/starting-with-r-basic-and-statistics/

Part 4: Making diagrams with R ( and transformation of data) : http://msprogrammer.serviciipeweb.ro/2016/10/31/making-diagrams-with-r/

Part 5:  Forecasting currency exchange with R   http://msprogrammer.serviciipeweb.ro/2016/11/06/forecasting-eurbnr-currency-exchange/

I have a site that list currency exchange data from National Bank of Romania ( www.infovalutar.ro ).

I was having a problem about how looks EUR data for each year that I recorded

My data was in a table structured like this

IDMoneda,Valoare,DataValorii

USD,0.8121,1998-01-05 00:00:00
USD,0.8203,1998-01-06 00:00:00
USD,0.8236,1998-01-07 00:00:00
USD,0.8269,1998-01-08 00:00:00
USD,0.836,1998-01-09 00:00:00
USD,0.842,1998-01-12 00:00:00
USD,0.8415,1998-01-13 00:00:00
USD,0.8407,1998-01-14 00:00:00
USD,0.8418,1998-01-15 00:00:00
USD,0.8423,1998-01-16 00:00:00
USD,0.8418,1998-01-19 00:00:00

….

EUR,1.3062,1999-01-04 00:00:00
EUR,1.3169,1999-01-05 00:00:00
EUR,1.3155,1999-01-06 00:00:00
EUR,1.3112,1999-01-07 00:00:00
EUR,1.3142,1999-01-08 00:00:00
EUR,1.3053,1999-01-11 00:00:00
EUR,1.2872,1999-01-12 00:00:00
EUR,1.3109,1999-01-13 00:00:00
EUR,1.3093,1999-01-14 00:00:00
EUR,1.3162,1999-01-15 00:00:00
EUR,1.3192,1999-01-18 00:00:00
EUR,1.3326,1999-01-19 00:00:00

….

And so on . The first column, IDMoneda, it is a ID of the currency (USD , EUR). The second, Valoare, is the Value. The third, DataValorii, is the value date( You can find all the data at https://1drv.ms/u/s!Aordxu1LWKDZgf1i0-y8cDIDoKNx2w  as a 7z file)

The task that I want is to have a diagram with the values for each year for EUR , in order to figure how it is goinf.

I have started small, with importing data:

BNR <- read.csv(file= "e:\\bnr.csv")

Then I have display the data

head(bnr,12)

Ok. So far so good. But I need the data in this form

EUR, DataValorii ( DateValue)

1.3062,1999-01-04 00:00:00

1.3169,1999-01-05 00:00:00

and EUR is on the row, not on the column. In SqlServer it is called PIVOT(https://technet.microsoft.com/en-us/library/ms177410(v=sql.105).aspx)

In R I have discovered the following link http://garrettgman.github.io/tidying/ and a library:

install.packages(‘tidyr’)

library(tidyr)

And then I have executed

 bnr1<-spread(BNR, IDMoneda, Valoare)

head(bnr1,12)

and voila – it is pivoting over the IDMoneda column

DataValorii AED    ATS    AUD    BEF  ….

I want to exclude NA values ( EUR was not from the first date, so some rows with EUR are nulls)

eur <-na.omit( data.frame(bnr1$EUR, bnr1$DataValorii))

And want to put better names of the columns

colnames(eur)[1] <- "EUR"
colnames(eur)[2] <- "Data"

Now we can see

head(eur,10)

And now in dataframe eur I have

  EUR                Data  
256 1.3062 1999-01-04 00:00:00
257 1.3169 1999-01-05 00:00:00
258 1.3155 1999-01-06 00:00:00
259 1.3112 1999-01-07 00:00:00
260 1.3142 1999-01-08 00:00:00
261 1.3053 1999-01-11 00:00:00
262 1.2872 1999-01-12 00:00:00
263 1.3109 1999-01-13 00:00:00
264 1.3093 1999-01-14 00:00:00
265 1.3162 1999-01-15 00:00:00

Now I want to transform second column into a date ( we can look with str(eur) to the structure of the dataframe)

eur$RealDate<- as.Date(eur$Data)

And add a year column

eur$Year<-as.numeric(format(eur$RealDate,’%Y’))

Now

head(eur, 10)

looks like

    EUR                Data   RealDate Year
256 1.3062 1999-01-04 00:00:00 1999-01-04 1999
257 1.3169 1999-01-05 00:00:00 1999-01-05 1999
258 1.3155 1999-01-06 00:00:00 1999-01-06 1999
259 1.3112 1999-01-07 00:00:00 1999-01-07 1999
260 1.3142 1999-01-08 00:00:00 1999-01-08 1999
261 1.3053 1999-01-11 00:00:00 1999-01-11 1999
262 1.2872 1999-01-12 00:00:00 1999-01-12 1999
263 1.3109 1999-01-13 00:00:00 1999-01-13 1999
264 1.3093 1999-01-14 00:00:00 1999-01-14 1999
265 1.3162 1999-01-15 00:00:00 1999-01-15 1999

And the structure

str(eur)

looks like

‘data.frame’:   3834 obs. of  4 variables:
$ EUR     : num  1.31 1.32 1.32 1.31 1.31 …
  $ Data    : Factor w/ 4119 levels "1998-01-05 00:00:00",..: 256 257 258 259 260 261 262 263 264 265 …
  $ RealDate: Date, format: "1999-01-04" "1999-01-05" "1999-01-06" "1999-01-07" …
  $ Year    : num  1999 1999 1999 1999 1999 …
  – attr(*, "na.action")=Class ‘omit’  Named int [1:285] 1 2 3 4 5 6 7 8 9 10 …
  .. ..- attr(*, "names")= chr [1:285] "1" "2" "3" "4" …

Now we want to filter the data based on the column Year, for example, for this year:

install.packages("dplyr")
library("dplyr")

thisYear= as.numeric(format(Sys.Date(),’%Y’))

eurThisYear <-filter(eur,Year == thisYear)

head(eurThisYear,10)

And the result is

      EUR                Data   RealDate Year
1  4.4629 2016-03-09 00:00:00 2016-03-09 2016
2  4.4624 2016-03-11 00:00:00 2016-03-11 2016
3  4.4705 2016-03-15 00:00:00 2016-03-15 2016
4  4.4775 2016-03-16 00:00:00 2016-03-16 2016
5  4.4765 2016-03-17 00:00:00 2016-03-17 2016
6  4.4694 2016-03-22 00:00:00 2016-03-22 2016
7  4.4641 2016-03-23 00:00:00 2016-03-23 2016
8  4.4655 2016-03-24 00:00:00 2016-03-24 2016
9  4.4639 2016-03-25 00:00:00 2016-03-25 2016
10 4.4619 2016-03-28 00:00:00 2016-03-28 2016

Now we want to extract the distinct years from column YEAR from dataframe eur and for each year generate a plot

install.packages(‘foreach’)
library(‘foreach’)

foreach(n = unique(eur$Year)) %do% {
  fileName= paste(n,"BNR_img.jpg",sep="_")
  eurThisYear <-filter(eur,Year == n)
plot(eurThisYear$RealDate, eurThisYear$EUR ,main=paste("EUR in year",n,sep=" "), sub="National Bank of Romania",       xlab="Months", ylab="Value 1 EUR in RON")
  dev.copy(jpeg,filename=fileName)
  dev.off()
}

The results can be found here https://1drv.ms/f/s!Aordxu1LWKDZgf1h_iMeIEbEWhgNtQ 

or here: https://1drv.ms/a/s!Aordxu1LWKDZgf13J4aeGHLEo2sIvQ

Total time spent (without knowing R, but just searching help) : 1 hour.

R is a wonderfull data scripting !

Starting with R- basic and statistics

Part 1: Starting with R – installation : http://msprogrammer.serviciipeweb.ro/2016/10/03/starting-with-r-as-in-sqlserver-r-server-and-vs-package/

Part 2: Starting with R – language and data http://msprogrammer.serviciipeweb.ro/2016/10/10/starting-with-r-language-and-data/

Part 3: Starting with R – basic and statistics http://msprogrammer.serviciipeweb.ro/2016/10/17/starting-with-r-basic-and-statistics/

Part 4: Making diagrams with R ( and transformation of data) : http://msprogrammer.serviciipeweb.ro/2016/10/31/making-diagrams-with-r/

Part 5:  Forecasting currency exchange with R   http://msprogrammer.serviciipeweb.ro/2016/11/06/forecasting-eurbnr-currency-exchange/

Read http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.723.6582&rep=rep1&type=pdf  ‘Learning Statistics with R: A tutorial for psychology students and other beginners.’ Author: Navarro   , a 500 pages book about R, statistics and diagram.

Take a week and understand.

Starting with R – language and data

Part 1: Starting with R – installation : http://msprogrammer.serviciipeweb.ro/2016/10/03/starting-with-r-as-in-sqlserver-r-server-and-vs-package/

Part 2: Starting with R – language and data http://msprogrammer.serviciipeweb.ro/2016/10/10/starting-with-r-language-and-data/

Part 3: Starting with R – basic and statistics http://msprogrammer.serviciipeweb.ro/2016/10/17/starting-with-r-basic-and-statistics/

Part 4: Making diagrams with R ( and transformation of data) : http://msprogrammer.serviciipeweb.ro/2016/10/31/making-diagrams-with-r/

Part 5:  Forecasting currency exchange with R   http://msprogrammer.serviciipeweb.ro/2016/11/06/forecasting-eurbnr-currency-exchange/

The http://tryr.codeschool.com/ also features a gentle introduction to R – if you do not know programming  .

As in other programming languages, you can store values into a variable to access it later. Type x <- 42 to store a value in x.

x <- 42

I have typed :

> x<-7

And site answers by:

`x` should be set to the number `42`. Try again!  # this is the site answer ….

> x<-42 # made by me

( and site goes to next exercise – although you can trick to strings …)

The next  best tutorial ( comprised on six parts) I have found here:

http://www.computerworld.com/article/2497164/business-intelligence/business-intelligence-beginner-s-guide-to-r-get-your-data-into-r.html

( did you know about data() command ?)

Anyway, those are the six parts

After that , if you have a Pluralsight subscription, this course is worth looking : http://www.pluralsight.com/courses/r-understanding-machine-learning  -  it gives you a short introduction to MachineLearning with R

Starting with R (as in SqlServer R server and VS package)

Part 1: Starting with R – installation : http://msprogrammer.serviciipeweb.ro/2016/10/03/starting-with-r-as-in-sqlserver-r-server-and-vs-package/

Part 2: Starting with R – language and data http://msprogrammer.serviciipeweb.ro/2016/10/10/starting-with-r-language-and-data/

Part 3: Starting with R – basic and statistics http://msprogrammer.serviciipeweb.ro/2016/10/17/starting-with-r-basic-and-statistics/

Part 4: Making diagrams with R ( and transformation of data) : http://msprogrammer.serviciipeweb.ro/2016/10/31/making-diagrams-with-r/

Part 5:  Forecasting currency exchange with R   http://msprogrammer.serviciipeweb.ro/2016/11/06/forecasting-eurbnr-currency-exchange/

First, start with installing Sql Server 2016 Developer ( free from https://www.microsoft.com/en-us/cloud-platform/sql-server-editions-developers )

Ensure that R is selected when installing.

Start SSMS ( Sql Server Management Studio) and ensure that R is configured and running ( https://msdn.microsoft.com/en-us/library/mt590884.aspx  or more detailed https://msdn.microsoft.com/en-us/library/mt696069.aspx )

Then install R Client from http://aka.ms/rclient/download ( see https://msdn.microsoft.com/en-us/library/mt696067.aspx ) . It features also an package for Visual Studio

Once you install R Client, start RGUI ( press Windows key and type RGUI ) .

In the window of RCLient, type getwd() ( R is case sensitive – does not work GetWD() )

( The same thing you can do in Visual Studio, from R Tools =>Windows=>R interactive)

Then you can follow the tutorial from https://msdn.microsoft.com/microsoft-r/scaler-getting-started# -  it is ok and easy to  reproduce.

Andrei Ignat weekly software news(mostly .NET)

* indicates required

Please select all the ways you would like to hear from me:

You can unsubscribe at any time by clicking the link in the footer of our emails. For information about our privacy practices, please visit our website.

We use Mailchimp as our marketing platform. By clicking below to subscribe, you acknowledge that your information will be transferred to Mailchimp for processing. Learn more about Mailchimp's privacy practices here.