Making diagrams with R

Part 1: Starting with R – installation : http://msprogrammer.serviciipeweb.ro/2016/10/03/starting-with-r-as-in-sqlserver-r-server-and-vs-package/

Part 2: Starting with R – language and data http://msprogrammer.serviciipeweb.ro/2016/10/10/starting-with-r-language-and-data/

Part 3: Starting with R – basic and statistics http://msprogrammer.serviciipeweb.ro/2016/10/17/starting-with-r-basic-and-statistics/

Part 4: Making diagrams with R ( and transformation of data) : http://msprogrammer.serviciipeweb.ro/2016/10/31/making-diagrams-with-r/

Part 5:  Forecasting currency exchange with R   http://msprogrammer.serviciipeweb.ro/2016/11/06/forecasting-eurbnr-currency-exchange/

I have a site that list currency exchange data from National Bank of Romania ( www.infovalutar.ro ).

I was having a problem about how looks EUR data for each year that I recorded

My data was in a table structured like this

IDMoneda,Valoare,DataValorii

USD,0.8121,1998-01-05 00:00:00
USD,0.8203,1998-01-06 00:00:00
USD,0.8236,1998-01-07 00:00:00
USD,0.8269,1998-01-08 00:00:00
USD,0.836,1998-01-09 00:00:00
USD,0.842,1998-01-12 00:00:00
USD,0.8415,1998-01-13 00:00:00
USD,0.8407,1998-01-14 00:00:00
USD,0.8418,1998-01-15 00:00:00
USD,0.8423,1998-01-16 00:00:00
USD,0.8418,1998-01-19 00:00:00

….

EUR,1.3062,1999-01-04 00:00:00
EUR,1.3169,1999-01-05 00:00:00
EUR,1.3155,1999-01-06 00:00:00
EUR,1.3112,1999-01-07 00:00:00
EUR,1.3142,1999-01-08 00:00:00
EUR,1.3053,1999-01-11 00:00:00
EUR,1.2872,1999-01-12 00:00:00
EUR,1.3109,1999-01-13 00:00:00
EUR,1.3093,1999-01-14 00:00:00
EUR,1.3162,1999-01-15 00:00:00
EUR,1.3192,1999-01-18 00:00:00
EUR,1.3326,1999-01-19 00:00:00

….

And so on . The first column, IDMoneda, it is a ID of the currency (USD , EUR). The second, Valoare, is the Value. The third, DataValorii, is the value date( You can find all the data at https://1drv.ms/u/s!Aordxu1LWKDZgf1i0-y8cDIDoKNx2w  as a 7z file)

The task that I want is to have a diagram with the values for each year for EUR , in order to figure how it is goinf.

I have started small, with importing data:

BNR <- read.csv(file= "e:\\bnr.csv")

Then I have display the data

head(bnr,12)

Ok. So far so good. But I need the data in this form

EUR, DataValorii ( DateValue)

1.3062,1999-01-04 00:00:00

1.3169,1999-01-05 00:00:00

and EUR is on the row, not on the column. In SqlServer it is called PIVOT(https://technet.microsoft.com/en-us/library/ms177410(v=sql.105).aspx)

In R I have discovered the following link http://garrettgman.github.io/tidying/ and a library:

install.packages(‘tidyr’)

library(tidyr)

And then I have executed

 bnr1<-spread(BNR, IDMoneda, Valoare)

head(bnr1,12)

and voila – it is pivoting over the IDMoneda column

DataValorii AED    ATS    AUD    BEF  ….

I want to exclude NA values ( EUR was not from the first date, so some rows with EUR are nulls)

eur <-na.omit( data.frame(bnr1$EUR, bnr1$DataValorii))

And want to put better names of the columns

colnames(eur)[1] <- "EUR"
colnames(eur)[2] <- "Data"

Now we can see

head(eur,10)

And now in dataframe eur I have

  EUR                Data  
256 1.3062 1999-01-04 00:00:00
257 1.3169 1999-01-05 00:00:00
258 1.3155 1999-01-06 00:00:00
259 1.3112 1999-01-07 00:00:00
260 1.3142 1999-01-08 00:00:00
261 1.3053 1999-01-11 00:00:00
262 1.2872 1999-01-12 00:00:00
263 1.3109 1999-01-13 00:00:00
264 1.3093 1999-01-14 00:00:00
265 1.3162 1999-01-15 00:00:00

Now I want to transform second column into a date ( we can look with str(eur) to the structure of the dataframe)

eur$RealDate<- as.Date(eur$Data)

And add a year column

eur$Year<-as.numeric(format(eur$RealDate,’%Y’))

Now

head(eur, 10)

looks like

    EUR                Data   RealDate Year
256 1.3062 1999-01-04 00:00:00 1999-01-04 1999
257 1.3169 1999-01-05 00:00:00 1999-01-05 1999
258 1.3155 1999-01-06 00:00:00 1999-01-06 1999
259 1.3112 1999-01-07 00:00:00 1999-01-07 1999
260 1.3142 1999-01-08 00:00:00 1999-01-08 1999
261 1.3053 1999-01-11 00:00:00 1999-01-11 1999
262 1.2872 1999-01-12 00:00:00 1999-01-12 1999
263 1.3109 1999-01-13 00:00:00 1999-01-13 1999
264 1.3093 1999-01-14 00:00:00 1999-01-14 1999
265 1.3162 1999-01-15 00:00:00 1999-01-15 1999

And the structure

str(eur)

looks like

‘data.frame’:   3834 obs. of  4 variables:
$ EUR     : num  1.31 1.32 1.32 1.31 1.31 …
  $ Data    : Factor w/ 4119 levels "1998-01-05 00:00:00",..: 256 257 258 259 260 261 262 263 264 265 …
  $ RealDate: Date, format: "1999-01-04" "1999-01-05" "1999-01-06" "1999-01-07" …
  $ Year    : num  1999 1999 1999 1999 1999 …
  – attr(*, "na.action")=Class ‘omit’  Named int [1:285] 1 2 3 4 5 6 7 8 9 10 …
  .. ..- attr(*, "names")= chr [1:285] "1" "2" "3" "4" …

Now we want to filter the data based on the column Year, for example, for this year:

install.packages("dplyr")
library("dplyr")

thisYear= as.numeric(format(Sys.Date(),’%Y’))

eurThisYear <-filter(eur,Year == thisYear)

head(eurThisYear,10)

And the result is

      EUR                Data   RealDate Year
1  4.4629 2016-03-09 00:00:00 2016-03-09 2016
2  4.4624 2016-03-11 00:00:00 2016-03-11 2016
3  4.4705 2016-03-15 00:00:00 2016-03-15 2016
4  4.4775 2016-03-16 00:00:00 2016-03-16 2016
5  4.4765 2016-03-17 00:00:00 2016-03-17 2016
6  4.4694 2016-03-22 00:00:00 2016-03-22 2016
7  4.4641 2016-03-23 00:00:00 2016-03-23 2016
8  4.4655 2016-03-24 00:00:00 2016-03-24 2016
9  4.4639 2016-03-25 00:00:00 2016-03-25 2016
10 4.4619 2016-03-28 00:00:00 2016-03-28 2016

Now we want to extract the distinct years from column YEAR from dataframe eur and for each year generate a plot

install.packages(‘foreach’)
library(‘foreach’)

foreach(n = unique(eur$Year)) %do% {
  fileName= paste(n,"BNR_img.jpg",sep="_")
  eurThisYear <-filter(eur,Year == n)
plot(eurThisYear$RealDate, eurThisYear$EUR ,main=paste("EUR in year",n,sep=" "), sub="National Bank of Romania",       xlab="Months", ylab="Value 1 EUR in RON")
  dev.copy(jpeg,filename=fileName)
  dev.off()
}

The results can be found here https://1drv.ms/f/s!Aordxu1LWKDZgf1h_iMeIEbEWhgNtQ 

or here: https://1drv.ms/a/s!Aordxu1LWKDZgf13J4aeGHLEo2sIvQ

Total time spent (without knowing R, but just searching help) : 1 hour.

R is a wonderfull data scripting !