Year: 2018

Design Patterns Class

 

For anyone interested, I propose a class on design patterns and their practical use in .NET Core / .NET Framework.

The tutorial will take 2 days. Afterwards, you will have practical knowledge of a number of design patterns and of how to apply them when building any framework.

The design patterns shown will be:

  1. Visitor
  2. Strategy
  3. Singleton
  4. Factory
  5. Façade
  6. Dependency injection
  7. Decorator
  8. Chain of responsibility
  9. Prototype
  10. Iterator
  11. Builder
  12. Adapter

Every design pattern will have:

  1. An explanation of the design pattern
  2. A demo of its use in .NET Core / .NET Framework
  3. Code showing its use (a small preview sketch follows below)
  4. A practical exercise for the participants
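
As a small preview of item 3, here is a minimal sketch of the Strategy pattern in C#. The class names and the discount scenario are illustrative only; they are not the actual class material.

using System;

// The strategy contract: any discount rule the caller can plug in.
public interface IDiscountStrategy
{
    decimal Apply(decimal price);
}

// One concrete strategy: no discount at all.
public class NoDiscount : IDiscountStrategy
{
    public decimal Apply(decimal price) => price;
}

// Another concrete strategy: a flat percentage off.
public class PercentDiscount : IDiscountStrategy
{
    private readonly decimal _percent;
    public PercentDiscount(decimal percent) { _percent = percent; }
    public decimal Apply(decimal price) => price * (1 - _percent / 100m);
}

// The context depends only on the interface, so the rule can be swapped at runtime.
public class Checkout
{
    private readonly IDiscountStrategy _discount;
    public Checkout(IDiscountStrategy discount) { _discount = discount; }
    public decimal Total(decimal price) => _discount.Apply(price);
}

public static class Demo
{
    public static void Main()
    {
        Console.WriteLine(new Checkout(new NoDiscount()).Total(100m));          // prints 100
        Console.WriteLine(new Checkout(new PercentDiscount(10m)).Total(100m));  // prints 90.0
    }
}

The Checkout class never changes when a new discount rule is added; only a new IDiscountStrategy implementation is written, which is the whole point of the pattern.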

For more details, please contact me on Skype (see the contact above).

Fertility Rate in European Countries with R

The source notebook is at https://notebooks.azure.com/ignatandrei/libraries/EuropeStatData – FertitlityRate.

Preload packages

Pacman

First, install pacman, so that all the other packages can be installed and loaded in one step.

In [1]:
if (!require("pacman")) install.packages("pacman")
              

Loading required package: pacman
              

Load the necessary packages

In [2]:
pacman::p_load('XML', "magrittr","ggplot2", "RCurl", "rlist", "rvest", "dplyr", "devtools","assertthat","XLConnect","tidyr")
              

Use the functions in the packages

In [3]:
library(stringr)
library(magrittr)
library(XML)
library(RCurl)
library(rlist)
library(ggplot2)
library(rvest)
library(tidyr)
              

Primary data

Download fertility rate for regions / counties

In [4]:
theUrl <- "http://ec.europa.eu/eurostat/web/products-datasets/-/demo_r_frate3"
content <- read_html(theUrl)
urlsDownload <- html_nodes(x = content, xpath = '//a') %>%  # find the <a> tags
                html_attr("href") %>%                       # take the href attribute
                .[grepl(glob2rx("*BulkDownload*"), .)]      # glob2rx converts the wildcard pattern to a regular expression

#print(urlsDownload)
assert_that(length(urlsDownload) == 1, msg = paste("urlsDownload must have length 1! real length", length(urlsDownload)))
urlFile <- urlsDownload[1]
print(paste("Downloading from ", urlFile))
              

TRUE

[1] "Downloading from  http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?file=data/demo_r_frate3.tsv.gz"
              

Now download the file from {{urlFile}} so we can inspect it.

In [5]:
#print(getwd())
dest <- file.path(getwd(), "file.gz")
download.file(urlFile, dest, mode = "wb", cacheOK = F)  # mode = binary
assert_that(file.exists(dest))
              

TRUE

Read data

In [6]:
data<-as.data.frame(read.table(gzfile(dest),fill = TRUE, header=T) )
keeps <- c("unit.geo.time", "X2015")
print(head(data))
              

  unit.geo.time X2015 X2014
1         NR,AL  1.67  1.79
2        NR,AL0  1.67  1.76
3         NR,AT  1.50  1.47
4        NR,AT1  1.45  1.43
5       NR,AT11  1.37  1.33
6      NR,AT111  1.39  1.39
              

Tidying the data

Split the first column

In [7]:
data <- data[keeps] %>%
    separate(unit.geo.time, c("unit", "geo"), ",")

print(head(data))
              

  unit   geo X2015
1   NR    AL  1.67
2   NR   AL0  1.67
3   NR    AT  1.50
4   NR   AT1  1.45
5   NR  AT11  1.37
6   NR AT111  1.39
              

Dictionary of metadata

Download dictionary

In [8]:
theUrlNuts <- "http://ec.europa.eu/eurostat/ramon/documents/nuts/NUTS_2013.zip"
destNuts <- file.path(getwd(), "nuts.zip")
download.file(theUrlNuts, destNuts, mode = "wb", cacheOK = F)  # mode = binary
assert_that(file.exists(destNuts))
              

TRUE

Reading into memory

In [9]:
# unzip() extracts the archive and returns the path of the extracted Excel file
nuts3 <- as.data.frame(readWorksheetFromFile(unzip(destNuts), sheet = 1))
              

Tidy the data

In [10]:
keeps <- c("NUTS.CODE", "NUTS.LABEL","NUTS.LEVEL","COUNTRY.CODE")
nuts3<- nuts3[keeps]
print(head(nuts3))
              

  NUTS.CODE
1        BE
2       BE1
3      BE10
4     BE100
5       BE2
6      BE21
                                                        NUTS.LABEL
1                                                  BELGIQUE-BELGIË
2   RÉGION DE BRUXELLES-CAPITALE / BRUSSELS HOOFDSTEDELIJK GEWEST
3   Région de Bruxelles-Capitale / Brussels Hoofdstedelijk Gewest
4          Arr. de Bruxelles-Capitale / Arr. van Brussel-Hoofdstad
5                                                    VLAAMS GEWEST
6                                                  Prov. Antwerpen
  NUTS.LEVEL COUNTRY.CODE
1          0           BE
2          1           BE
3          2           BE
4          3           BE
5          1           BE
6          2           BE
              

Merging the data with metadata

In [11]:
allData <- merge(nuts3,data,by.x="NUTS.CODE",by.y="geo")
allData$Val <- with(allData, as.numeric(as.character(X2015)))
print(head(allData))
              

  NUTS.CODE       NUTS.LABEL NUTS.LEVEL COUNTRY.CODE unit X2015  Val
1        AT       ÖSTERREICH          0           AT   NR  1.50 1.50
2       AT1    OSTÖSTERREICH          1           AT   NR  1.45 1.45
3      AT11       Burgenland          2           AT   NR  1.37 1.37
4     AT111 Mittelburgenland          3           AT   NR  1.39 1.39
5     AT112   Nordburgenland          3           AT   NR  1.36 1.36
6     AT113    Südburgenland          3           AT   NR  1.38 1.38
              

Data for countries

Filtering data

In [12]:
allCountries <- allData %>%
     filter(NUTS.LEVEL==0) %>%
     select(NUTS.CODE,Val) %>%
    as.data.frame

print(allCountries)
              

   NUTS.CODE  Val
1         AT 1.50
2         BE 1.70
3         BG 1.53
4         CY 1.33
5         CZ 1.57
6         DE 1.50
7         DK 1.70
8         EE 1.58
9         EL 1.33
10        ES 1.34
11        FI 1.65
12        FR 1.96
13        HR 1.40
14        HU 1.45
15        IE 1.94
16        IT 1.35
17        LT 1.71
18        LU 1.47
19        LV 1.71
20        MT 1.45
21        NL 1.66
22        PL 1.33
23        PT 1.30
24        RO 1.58
25        SE 1.85
26        SI 1.58
27        SK 1.41
28        UK 1.81
              

Show data in graphic form

In [13]:
sapply(allCountries, mode)
png(filename="allCountries.png", width=600, height=600)
ggplot(allCountries, aes(x=reorder(NUTS.CODE,-Val), y=Val))+geom_point()
dev.off()
              

NUTS.CODE: 'character'
Val: 'numeric'

png: 2

Image for all countries

All Countries

Display data for a chosen country

Choosing the country

In [14]:
countryCode <- "RO"
              

Filtering the data

Filtering data for {{countryCode}}

In [15]:
#head(allData)
Country <- allData %>%
    filter(COUNTRY.CODE == countryCode) %>%
    filter(NUTS.LEVEL == 3) %>%
    select(NUTS.LABEL, NUTS.CODE, Val)

Country <- as.data.frame(Country)
#print(Country)
              


Show data in graphic form

In [ ]:
#sapply(Country, mode)
file <- paste('Country', countryCode, '.png', sep = '')
print(file)
png(filename = file, width = 600, height = 600)
ggplot(Country, aes(x = Val, y = reorder(NUTS.LABEL, -Val))) + geom_point()
dev.off()
              

Show the graphic for the chosen country

(image: fertility rate by NUTS-3 region for the selected country)

Download large files with R and RCurl

First, read the documentation. You can explore the available options with:

#listCurlOptions()
#getCurlOptionsConstants()
#getCurlOptionTypes(opts = getCurlOptionsConstants())
#print(getCurlOptionTypes)

Then, this is the code (pretty self-explanatory):

downloadFile <- function(url, file) {
    curl <- getCurlHandle()
    # force HTTP 1.0 and verbose output for the transfer
    curlSetOpt(.opts = list(http.version = HTTP_VERSION_1_0, verbose = TRUE),
               curl = curl)

    # open the destination file in binary mode and stream the download into it
    f <- CFILE(file, mode = "wb")
    a <- curlPerform(url = url, curl = curl, writedata = f@ref, noprogress = FALSE)
    close(f)
}

GitHub as a repository for a database (sqlite)

In making http://isthistaxilegal.apphb.com I needed a database. I decided on sqlite, since it needs just a file and the dll's. (Side note: the right dll's for sqlite and .NET Core are not the original ones, which have a dependency on the .NET Framework, but Microsoft.Data.Sqlite.) But how to deploy this file every time?

Simplest way: verify at start whether the file exists and, if not, download it from GitHub.

The code is

var url = "https://raw.githubusercontent.com/ignatandrei/IsThisTaxiLegal/master/datas/taxis.sqlite3";
if (!File.Exists("taxis.sqlite3")) // download only when the database is missing
{
    using (var client = new HttpClient())
    {
        using (var result = await client.GetAsync(url))
        {
            if (result.IsSuccessStatusCode)
            {
                var bytes = await result.Content.ReadAsByteArrayAsync();
                File.WriteAllBytes("taxis.sqlite3", bytes);
            }
        }
    }
}

And the idea is not bad at all: keeping the database on GitHub and downloading it when needed works just fine…
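
To make the Microsoft.Data.Sqlite side note above concrete, here is a minimal sketch of how the downloaded file could then be opened from .NET Core. The table name taxis is only an assumption for illustration, not necessarily the real schema.

using System;
using Microsoft.Data.Sqlite;

// open the file downloaded by the code above
using (var connection = new SqliteConnection("Data Source=taxis.sqlite3"))
{
    connection.Open();
    using (var command = connection.CreateCommand())
    {
        command.CommandText = "SELECT count(*) FROM taxis"; // "taxis" is a hypothetical table name
        var count = (long)command.ExecuteScalar();
        Console.WriteLine($"Rows in database: {count}");
    }
}

Microsoft.Data.Sqlite has no dependency on the full .NET Framework, which is exactly why it is the right choice for .NET Core.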

Interpreting PDF to database (sqlite)

In making http://isthistaxilegal.apphb.com I had to parse PDF files that contain tables, which is a daunting task. I investigated the existing software (the most used one is https://github.com/tabulapdf), but even that did not come close. The problem was not transforming the PDF into text; the problem was transforming the tables in the PDF into CSV or something similar. One of the reasons this is difficult is stated here: https://stackoverflow.com/questions/31579922/extracting-tables-lines-and-coordinates-from-pdf-using-pdfsharp-c :

The (my) solution
Starting from a PDF reader (iTextSharp) you need to:
1. read the lines (hopefully only vertical and horizontal lines);
2. join the lines (a line of a table could be several lines, for example one per cell);
3. understand where the tables are (sometimes making some hypothesis based on your needs);
4. optionally find the text outside the tables (better to keep all the text) and insert it in paragraphs;
5. Insert text inside the cells of the table

So I tried the old automation solution: Word. Word can read PDF files and, even better, the tables are transferred as real tables! So the next task is to transform the Word tables into CSV. Word macro to the rescue:

Sub ExportTables()
    Dim d As Document
    Set d = ActiveDocument
    MsgBox d.Tables.Count
    Dim t As Table
    Dim r As Row
    Dim iRow As Integer
    Dim iCell As Integer
    Dim str As String
    Dim start As Integer
    start = 1
    For Each t In d.Tables
        
        'str = str + t.ConvertToText(Separator:=wdSeparateByCommas).Text
        Dim nrRows As Integer
        nrRows = t.Rows.Count
        For iRow = start To nrRows
            
            Set r = t.Rows.Item(iRow)
            Dim nrCells As Integer
            nrCells = r.Cells.Count
            For iCell = 1 To nrCells
                str = str & Replace(Replace(Replace(r.Cells.Item(iCell).Range.Text, vbCr, ""), vbLf, ""), Chr(7), "") & "|"
            Next iCell
            
            str = str & vbCrLf
            
        Next iRow
            
        If start = 1 Then start = 2 ' do not read headers for each table
    Next t
    Open d.Path & "\Cluj.csv" For Output As #1
    Print #1, str
    Close #1
End Sub


Then, to get from CSV to sqlite, I used sqlite3.exe with the following commands:

.open ../taxis.sqlite3
.import ../bucuresti.csv bucuresti
.quit

The overall flow: PDF → Word (tables are preserved as real tables) → CSV (the macro above) → sqlite (sqlite3.exe import).
