alekrutkowski / wiod.diagrammer

R package for easy work with WIOD (the 2016 release), including diagramming (flowcharts)

wiod.diagrammer -- R package for easy work with WIOD (the 2016 release), including diagramming (flowcharts)

Aleksander Rutkowski 2017-12-21

Installation

## if package `devtools` not installed, first do this:
# install.packages('devtools')
devtools::install_github('alekrutkowski/wiod.diagrammer')

Documentation

https://alekrutkowski.github.io/wiod.diagrammer/reference/index.html

Example

Load WIOD data for a given year from an official WIOD .RData file (2016 release). The file must first be downloaded from http://www.wiod.org/protected3/data16/wiot_ROW/wiot_r_Nov16.zip and extracted (unzipped).
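The download-and-unzip step can also be scripted. The following is a minimal sketch, assuming the URL above is reachable without a login; `fetch_wiod_zip` is a hypothetical helper, not a function of wiod.diagrammer, and it relies only on `utils::download.file` and `utils::unzip` from base R.

```r
# Hypothetical helper (not part of wiod.diagrammer): download a zip archive
# and extract it into `dest_dir`, returning the paths of the extracted files.
fetch_wiod_zip <- function(url, dest_dir = tempdir()) {
  zip_path <- file.path(dest_dir, basename(url))
  # mode = "wb" keeps the binary archive intact on Windows
  utils::download.file(url, destfile = zip_path, mode = "wb")
  utils::unzip(zip_path, exdir = dest_dir)
}

# Not run here, because it needs network access:
# files <- fetch_wiod_zip('http://www.wiod.org/protected3/data16/wiot_ROW/wiot_r_Nov16.zip')
# W <- loadWIOD(grep('WIOT2014', files, value = TRUE))
```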

library(wiod.diagrammer)
W <- loadWIOD('WIOT2014_October16_ROW.RData')
## Loading "WIOT2014_October16_ROW.RData"...
# Check the names of the first 20 columns:
cat(head(colnames(W), 20),
    sep='\n')
## IndustryCode
## IndustryDescription
## Country
## RNr
## AUS1
## AUS2
## AUS3
## AUS4
## AUS5
## AUS6
## AUS7
## AUS8
## AUS9
## AUS10
## AUS11
## AUS12
## AUS13
## AUS14
## AUS15
## AUS16
# How many columns and rows?
message(ncol(W),' columns; ',nrow(W),' rows')
## 2689 columns; 2472 rows

Now let's flatten (reshape into long format):

W_flat <- flatWIOD(W)
## Reshaping into long/flat format...
# See the structure of W_flat:
str(W_flat)
## Classes 'data.table' and 'data.frame':   6613376 obs. of  5 variables:
##  $ ExpSectorNr: int  1 2 3 4 5 6 7 8 9 10 ...
##  $ ExpCountry : chr  "AUS" "AUS" "AUS" "AUS" ...
##  $ value      : num  12924.2 83 19.1 115.9 1590.8 ...
##  $ ImpSectorNr: int  1 1 1 1 1 1 1 1 1 1 ...
##  $ ImpCountry : chr  "AUS" "AUS" "AUS" "AUS" ...
##  - attr(*, ".internal.selfref")=<externalptr> 
##  - attr(*, "isFlatWIOD")= logi TRUE
head(W_flat, 10)
##     ExpSectorNr ExpCountry       value ImpSectorNr ImpCountry
##  1:           1        AUS 12924.17969           1        AUS
##  2:           2        AUS    83.02964           1        AUS
##  3:           3        AUS    19.14773           1        AUS
##  4:           4        AUS   115.92985           1        AUS
##  5:           5        AUS  1590.84059           1        AUS
##  6:           6        AUS    42.39361           1        AUS
##  7:           7        AUS    22.95618           1        AUS
##  8:           8        AUS    27.70757           1        AUS
##  9:           9        AUS    64.61939           1        AUS
## 10:          10        AUS   877.44099           1        AUS
tail(W_flat, 10)
##     ExpSectorNr ExpCountry         value ImpSectorNr ImpCountry
##  1:          47        ROW  3662.2757637          61        ROW
##  2:          48        ROW -9023.1820353          61        ROW
##  3:          49        ROW   523.1589139          61        ROW
##  4:          50        ROW 13628.6077185          61        ROW
##  5:          51        ROW  7020.4946769          61        ROW
##  6:          52        ROW 13934.2109353          61        ROW
##  7:          53        ROW  2960.8473143          61        ROW
##  8:          54        ROW  2171.8532215          61        ROW
##  9:          55        ROW   -73.4200616          61        ROW
## 10:          56        ROW     0.8743851          61        ROW
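To illustrate what "flattening" means, here is a minimal base R sketch on a made-up 2 x 2 table. It is not the implementation of flatWIOD; the toy values and the names `wide` and `long` are assumptions for illustration only. Each cell of the wide table becomes one row, and the country code and sector number are split out of column names such as AUS1.

```r
# Toy wide table: rows are exporting country-sectors, columns importing ones
wide <- matrix(c(5, 1, 2, 7), nrow = 2,
               dimnames = list(exporter = c("AUS1", "AUS2"),
                               importer = c("AUS1", "AUS2")))

# One row per (exporter, importer) cell; the exporter varies fastest
long <- as.data.frame(as.table(wide), responseName = "value",
                      stringsAsFactors = FALSE)

# Split codes such as "AUS1" into a country code and a sector number
long$ExpCountry  <- sub("[0-9]+$", "", long$exporter)
long$ExpSectorNr <- as.integer(sub("^[A-Z]+", "", long$exporter))
long$ImpCountry  <- sub("[0-9]+$", "", long$importer)
long$ImpSectorNr <- as.integer(sub("^[A-Z]+", "", long$importer))
```

The same arithmetic is consistent with the size of W_flat shown above: 6613376 = 2464 x 2684, i.e. apparently 44 x 56 exporting country-sector pairs times 44 x 61 importing ones.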

wiod.diagrammer uses the data.table package internally for performance. You can work with W_flat using data.table semantics (e.g. the := operator for in-place column creation/modification). If you prefer base R data.frames, you can convert W_flat into a regular data.frame with as.data.frame, make your modifications, and then convert it back into a data.table with data.table::as.data.table for further processing (through the function findLinks described below).
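Both routes can be sketched on a toy table. The table `toy_flat` below only mimics the column names of W_flat; its values and the added columns `value_bn` and `is_large` are made up for illustration.

```r
library(data.table)

# Toy stand-in for W_flat with the same column names (values are made up)
toy_flat <- data.table(ExpSectorNr = c(1L, 2L),
                       ExpCountry  = c("AUS", "AUS"),
                       value       = c(100, 50),
                       ImpSectorNr = c(1L, 1L),
                       ImpCountry  = c("AUS", "DEU"))

# Route 1: data.table semantics, adding a column in place with `:=`
toy_flat[, value_bn := value / 1000]

# Route 2: base R semantics on a data.frame, then back to a data.table
toy_df <- as.data.frame(toy_flat)
toy_df$is_large <- toy_df$value >= 100
toy_flat2 <- data.table::as.data.table(toy_df)
```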

Now let's find the top supplier-user linkages in WIOD (as defined by the column value in W_flat which -- if not modified -- corresponds to the intermediate consumption in W).

First, let's introduce the helper function tieRobustRankLessOrEqual (available in wiod.diagrammer) and compare it with base R's rank:

x <- c(1,1,2,2,2,3,3)
# A comparison:
print(data.frame(x,
                 rank(x) <= 2,
                 tieRobustRankLessOrEqual(x, 2),
                 check.names = FALSE))
##  x   rank(x) <= 2   tieRobustRankLessOrEqual(x, 2)  
##  1    TRUE           TRUE                           
##  1    TRUE           TRUE                           
##  2   FALSE           TRUE                           
##  2   FALSE           TRUE                           
##  2   FALSE           TRUE                           
##  3   FALSE          FALSE                           
##  3   FALSE          FALSE
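Judging from the output above, tieRobustRankLessOrEqual(x, n) seems to behave like a "dense" rank: tied values share a rank and no rank is skipped, so all values among the n smallest distinct values are selected. The sketch below reproduces that behaviour in base R. `dense_rank_le` is a hypothetical re-implementation for illustration, not the package's code.

```r
# Hypothetical illustration: TRUE where x is among the n smallest distinct values
dense_rank_le <- function(x, n) {
  # match() gives the position of each value among the sorted distinct values,
  # i.e. a dense rank in which ties share one rank and no rank is skipped
  match(x, sort(unique(x))) <= n
}

x <- c(1, 1, 2, 2, 2, 3, 3)
res <- dense_rank_le(x, 2)
```

Negating the input, as in dense_rank_le(-x, 2), selects the n largest distinct values instead.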

Now let's find, for example, the 3 main customers of the German automotive industry, the 2 main customers of each of those customers, and the 1 main customer of each of those customers' customers (we could go deeper or use different cut-off ranks, not necessarily in declining order). So we have 3 levels of linkages: the 1st is direct, while the 2nd and 3rd are indirect. NB: Some industries may re-emerge in different rounds, and they may be customers and suppliers at the same time. Let's also differentiate our ranks (top 3, 2, and 1) by one more dimension: domestic vs foreign linkages.

W_flat[, domestic :=   # creating column in-place, following data.table's semantics
           ExpCountry == ImpCountry]
# Let's get rid of self-produced intermediate consumption
W_flat_noself <- W_flat[!(domestic &
                              ExpSectorNr == ImpSectorNr)]
# Let's keep only intra-EU trade
COUNTRIES_DT <- countries()
EU_COUNTRIES <-
    COUNTRIES_DT$Country[COUNTRIES_DT$isEUmember]
# Let's keep only flows >= 1 billion USD for clarity
W_flat_noself_truncated <- W_flat_noself[value >= 1000 &  # original WIOD data is in million USD
                                             ExpCountry %in% EU_COUNTRIES &
                                             ImpCountry %in% EU_COUNTRIES]
TOP_CUSTOMERS <-
    findLinks(partners = 'users',
              flat_wiod = W_flat_noself_truncated,
              start_countries = 'DEU', # We could add here other countries.
              start_sectors = 20, # "Manufacture of motor vehicles, trailers and semi-trailers"
              # We could add here other sectors.
              ListOfselectionFuns = # As discussed in the text above:
                  list(function(flat_wiod) tieRobustRankLessOrEqual(-flat_wiod$value, 3),  # minus because we
                       function(flat_wiod) tieRobustRankLessOrEqual(-flat_wiod$value, 2),  # want to rank from
                       function(flat_wiod) tieRobustRankLessOrEqual(-flat_wiod$value, 1)), # the highest to the lowest
              by = c('domestic','ExpCountry','ExpSectorNr')) # as discussed above
## Round 1...

## Selecting users (country and sector combinations)
## for each combination of: `domestic`,`ExpCountry`,`ExpSectorNr`...

## Accumulated 6 unique linkages.

## Round 2...

## Selecting users (country and sector combinations)
## for each combination of: `domestic`,`ExpCountry`,`ExpSectorNr`...

## Accumulated 10 unique linkages.

## Round 3...

## Selecting users (country and sector combinations)
## for each combination of: `domestic`,`ExpCountry`,`ExpSectorNr`...

## Accumulated 10 unique linkages.
str(TOP_CUSTOMERS)
## Classes 'data.table' and 'data.frame':   10 obs. of  6 variables:
##  $ ExpSectorNr: int  20 20 20 20 20 20 19 19 19 19
##  $ ExpCountry : chr  "DEU" "DEU" "DEU" "DEU" ...
##  $ value      : num  4627 9502 14825 6007 60939 ...
##  $ ImpSectorNr: int  57 57 57 19 57 60 60 60 20 60
##  $ ImpCountry : chr  "ESP" "FRA" "GBR" "DEU" ...
##  $ domestic   : logi  FALSE FALSE FALSE TRUE TRUE TRUE ...
##  - attr(*, ".internal.selfref")=<externalptr>
head(TOP_CUSTOMERS, 20)
##     ExpSectorNr ExpCountry     value ImpSectorNr ImpCountry domestic
##  1:          20        DEU  4626.951          57        ESP    FALSE
##  2:          20        DEU  9501.616          57        FRA    FALSE
##  3:          20        DEU 14825.307          57        GBR    FALSE
##  4:          20        DEU  6006.584          19        DEU     TRUE
##  5:          20        DEU 60939.054          57        DEU     TRUE
##  6:          20        DEU 24389.622          60        DEU     TRUE
##  7:          19        DEU  4695.542          60        GBR    FALSE
##  8:          19        DEU  4691.281          60        ITA    FALSE
##  9:          19        DEU  9941.646          20        DEU     TRUE
## 10:          19        DEU 37712.819          60        DEU     TRUE
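The round-by-round search can be illustrated on a toy edge list. This is a simplified sketch under stated assumptions, not the implementation of findLinks: the function `find_users_sketch`, the column names `exp`, `imp` and `value`, and the toy data are all hypothetical, and the grouping by `domestic` is left out.

```r
# Hypothetical sketch, not the package's findLinks implementation.
# `edges` needs the columns exp, imp and value; `top_n` holds one cut-off per round.
find_users_sketch <- function(edges, start, top_n) {
  found <- edges[0, ]          # accumulated linkages (empty at the start)
  frontier <- start            # suppliers whose customers are searched next
  for (n in top_n) {
    candidates <- edges[edges$exp %in% frontier, ]
    if (nrow(candidates) == 0) break
    # Dense rank of the flows within each supplier, largest flow = rank 1
    rk <- ave(-candidates$value, candidates$exp,
              FUN = function(v) match(v, sort(unique(v))))
    selected <- candidates[rk <= n, ]
    found <- unique(rbind(found, selected))
    frontier <- unique(selected$imp)  # customers become the next round's suppliers
  }
  found
}

edges <- data.frame(exp   = c("A", "A", "A", "B", "B", "C", "C"),
                    imp   = c("B", "C", "D", "C", "E", "E", "A"),
                    value = c(10, 8, 1, 5, 4, 7, 2),
                    stringsAsFactors = FALSE)

# Top 2 customers of A, then the top customer of each of those customers
links <- find_users_sketch(edges, start = "A", top_n = c(2, 1))
```

With these toy numbers the first round keeps A->B and A->C, and the second round adds B->C and C->E.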

Now let's do a similar exercise, just "upstream" i.e. for suppliers of suppliers.

TOP_SUPPLIERS <-
    findLinks(partners = 'suppliers',
              flat_wiod = W_flat_noself_truncated,
              start_countries = 'DEU', # We could add here other countries.
              start_sectors = 20, # "Manufacture of motor vehicles, trailers and semi-trailers"
              # We could add here other sectors.
              ListOfselectionFuns = # As discussed in the text above:
                  list(function(flat_wiod) tieRobustRankLessOrEqual(-flat_wiod$value, 3),  # minus because we
                       function(flat_wiod) tieRobustRankLessOrEqual(-flat_wiod$value, 2),  # want to rank from
                       function(flat_wiod) tieRobustRankLessOrEqual(-flat_wiod$value, 1)), # the highest to the lowest
              by = c('domestic','ImpCountry','ImpSectorNr')) # as discussed above
## Round 1...

## Selecting suppliers (country and sector combinations)
## for each combination of: `domestic`,`ImpCountry`,`ImpSectorNr`...

## Accumulated 6 unique linkages.

## Round 2...

## Selecting suppliers (country and sector combinations)
## for each combination of: `domestic`,`ImpCountry`,`ImpSectorNr`...

## Accumulated 24 unique linkages.

## Round 3...

## Selecting suppliers (country and sector combinations)
## for each combination of: `domestic`,`ImpCountry`,`ImpSectorNr`...

## Accumulated 38 unique linkages.
str(TOP_SUPPLIERS)
## Classes 'data.table' and 'data.frame':   38 obs. of  6 variables:
##  $ ExpSectorNr: int  20 20 20 15 16 28 15 15 24 31 ...
##  $ ExpCountry : chr  "CZE" "HUN" "POL" "DEU" ...
##  $ value      : num  5912 5165 4258 10347 15148 ...
##  $ ImpSectorNr: int  20 20 20 20 20 20 15 15 15 15 ...
##  $ ImpCountry : chr  "DEU" "DEU" "DEU" "DEU" ...
##  $ domestic   : logi  FALSE FALSE FALSE TRUE TRUE TRUE ...
##  - attr(*, ".internal.selfref")=<externalptr>
head(TOP_SUPPLIERS, 20)
##     ExpSectorNr ExpCountry     value ImpSectorNr ImpCountry domestic
##  1:          20        CZE  5912.361          20        DEU    FALSE
##  2:          20        HUN  5165.467          20        DEU    FALSE
##  3:          20        POL  4258.478          20        DEU    FALSE
##  4:          15        DEU 10346.647          20        DEU     TRUE
##  5:          16        DEU 15147.662          20        DEU     TRUE
##  6:          28        DEU 16858.074          20        DEU     TRUE
##  7:          15        FRA  2105.831          15        DEU    FALSE
##  8:          15        ITA  2425.400          15        DEU    FALSE
##  9:          24        DEU  5240.981          15        DEU     TRUE
## 10:          31        DEU  3520.693          15        DEU     TRUE
## 11:          15        DEU  9825.688          16        DEU     TRUE
## 12:          50        DEU  4845.994          16        DEU     TRUE
## 13:          15        ITA  1041.497          16        DEU    FALSE
## 14:          13        CZE  1926.013          20        CZE     TRUE
## 15:          28        CZE  1525.946          20        CZE     TRUE
## 16:          20        DEU  3192.376          20        CZE    FALSE
## 17:          20        POL  1082.968          20        CZE    FALSE
## 18:          19        DEU  2446.414          20        HUN    FALSE
## 19:          20        DEU  3146.213          20        HUN    FALSE
## 20:          20        DEU  3111.205          20        POL    FALSE

Now, let's plot the "upstream" linkages, making all the German sectors blue. By default, the cross-border flows are dashed, while the domestic flows are solid lines (arrows). The numbers in the nodes (rectangles) represent, by default, the sector output (or the total intermediate consumption for the final use sectors such as final consumption or investment if they show up in the customers' graphs).

In wiod.diagrammer the plot is rendered internally by DiagrammeR::grViz. The plots can be saved manually in RStudio, or programmatically: to an .svg file by converting them with DiagrammeRsvg::export_svg into a character vector and writing it out with cat, or to a .png file by piping them through DiagrammeRsvg::export_svg, charToRaw and rsvg::rsvg_png.

plotLinks(top_links_dt = TOP_SUPPLIERS,
          wiot = W, # this is necessary
          specificNodeOptionsFun =  # this is optional, just to show off:
            function(country_sector_dt)
                ifelse(country_sector_dt$Country=='DEU',
                       'style=filled, fillcolor=cadetblue1', "")) # GraphViz colour names can be found at:
                                                                  # http://www.graphviz.org/doc/info/colors.html

[Figure: graph of the "upstream" linkages produced by the call above]
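The programmatic export routes mentioned above can be sketched as follows. `export_graph` is a hypothetical helper, and it assumes that the object passed to it is a DiagrammeR::grViz htmlwidget (whether plotLinks returns such an object is an assumption here). The packages DiagrammeR, DiagrammeRsvg and rsvg need to be installed for the commented-out example to run.

```r
# Hypothetical helper: `graph` is assumed to be a DiagrammeR::grViz htmlwidget
export_graph <- function(graph, svg_file, png_file) {
  svg <- DiagrammeRsvg::export_svg(graph)   # SVG markup as a character vector
  cat(svg, file = svg_file)                 # write the .svg file
  rsvg::rsvg_png(charToRaw(svg), png_file)  # rasterise the same SVG to .png
  invisible(c(svg_file, png_file))
}

# Example with a stand-in graph (not run here):
# g <- DiagrammeR::grViz('digraph { "DEU 15" -> "DEU 20" }')
# export_graph(g, 'links.svg', 'links.png')
```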

What else is available in the package? Functions that produce auxiliary data.tables (these are used by wiod.diagrammer's plotLinks function when it is called with its default argument values).

COUNTRIES <- countries() # NB: no argument to function `countries`
str(COUNTRIES)
## Classes 'data.table' and 'data.frame':   45 obs. of  3 variables:
##  $ CountryLab: chr  "Australia" "Austria" "Belgium" "Bulgaria" ...
##  $ Country   : chr  "AUS" "AUT" "BEL" "BGR" ...
##  $ isEUmember: logi  FALSE TRUE TRUE TRUE FALSE FALSE ...
##  - attr(*, ".internal.selfref")=<externalptr>
print(COUNTRIES)
##  CountryLab         Country   isEUmember  
##  Australia          AUS       FALSE       
##  Austria            AUT        TRUE       
##  Belgium            BEL        TRUE       
##  Bulgaria           BGR        TRUE       
##  Brazil             BRA       FALSE       
##  Canada             CAN       FALSE       
##  Switzerland        CHE       FALSE       
##  China              CHN       FALSE       
##  Cyprus             CYP        TRUE       
##  Czech Republic     CZE        TRUE       
##  Germany            DEU        TRUE       
##  Denmark            DNK        TRUE       
##  Spain              ESP        TRUE       
##  Estonia            EST        TRUE       
##  Finland            FIN        TRUE       
##  France             FRA        TRUE       
##  United Kingdom     GBR        TRUE       
##  Greece             GRC        TRUE       
##  Croatia            HRV        TRUE       
##  Hungary            HUN        TRUE       
##  Indonesia          IDN       FALSE       
##  India              IND       FALSE       
##  Ireland            IRL        TRUE       
##  Italy              ITA        TRUE       
##  Japan              JPN       FALSE       
##  Korea              KOR       FALSE       
##  Lithuania          LTU        TRUE       
##  Luxembourg         LUX        TRUE       
##  Latvia             LVA        TRUE       
##  Mexico             MEX       FALSE       
##  Malta              MLT        TRUE       
##  Netherlands        NLD        TRUE       
##  Norway             NOR       FALSE       
##  Poland             POL        TRUE       
##  Portugal           PRT        TRUE       
##  Romania            ROU        TRUE       
##  Rest of the World  ROW       FALSE       
##  Russian Federation RUS       FALSE       
##  Slovak Republic    SVK        TRUE       
##  Slovenia           SVN        TRUE       
##  Sweden             SWE        TRUE       
##  TOTAL              TOT       FALSE       
##  Turkey             TUR       FALSE       
##  Taiwan             TWN       FALSE       
##  United States      USA       FALSE
SECTORS <- sectors(W)
SECTORS[, SectorLab :=  # truncate the long sector labels just for clarity below
            substr(SectorLab, 1, 30)]
str(SECTORS)
## Classes 'data.table' and 'data.frame':   61 obs. of  4 variables:
##  $ SectorNr  : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ SectorLab : chr  "Crop and animal production, hu" "Forestry and logging" "Fishing and aquaculture" "Mining and quarrying" ...
##  $ SectorCode: chr  "A01" "A02" "A03" "B" ...
##  $ isFinal   : logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
##  - attr(*, ".internal.selfref")=<externalptr>
head(SECTORS, 20)
##     SectorNr                      SectorLab SectorCode isFinal
##  1:        1 Crop and animal production, hu        A01   FALSE
##  2:        2           Forestry and logging        A02   FALSE
##  3:        3        Fishing and aquaculture        A03   FALSE
##  4:        4           Mining and quarrying          B   FALSE
##  5:        5 Manufacture of food products,     C10-C12   FALSE
##  6:        6 Manufacture of textiles, weari    C13-C15   FALSE
##  7:        7 Manufacture of wood and of pro        C16   FALSE
##  8:        8 Manufacture of paper and paper        C17   FALSE
##  9:        9 Printing and reproduction of r        C18   FALSE
## 10:       10 Manufacture of coke and refine        C19   FALSE
## 11:       11 Manufacture of chemicals and c        C20   FALSE
## 12:       12 Manufacture of basic pharmaceu        C21   FALSE
## 13:       13 Manufacture of rubber and plas        C22   FALSE
## 14:       14 Manufacture of other non-metal        C23   FALSE
## 15:       15    Manufacture of basic metals        C24   FALSE
## 16:       16 Manufacture of fabricated meta        C25   FALSE
## 17:       17 Manufacture of computer, elect        C26   FALSE
## 18:       18 Manufacture of electrical equi        C27   FALSE
## 19:       19 Manufacture of machinery and e        C28   FALSE
## 20:       20 Manufacture of motor vehicles,        C29   FALSE
AGGREGATES <- aggregates(W)

The variable/column names produced by the function aggregates reflect those in WIOD. They are explained in the documentation of aggregates.

str(AGGREGATES) 
## Classes 'data.table' and 'data.frame':   2684 obs. of  10 variables:
##  $ SectorNr: int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Country : chr  "AUS" "AUS" "AUS" "AUS" ...
##  $ II_fob  : num  39039 925 1205 72426 58385 ...
##  $ TXSP    : num  501.9 68.1 51.7 297.6 631.8 ...
##  $ EXP_adj : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ PURR    : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ PURNR   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ VA      : num  30489 1570 1895 98315 24247 ...
##  $ IntTTM  : num  261.7 22.2 23.9 946.5 240.3 ...
##  $ GO      : num  70292 2585 3175 171985 83504 ...
##  - attr(*, ".internal.selfref")=<externalptr>
head(AGGREGATES, 20)
##     SectorNr Country     II_fob       TXSP EXP_adj PURR PURNR        VA
##  1:        1     AUS 39039.2389 501.926754       0    0     0 30489.190
##  2:        2     AUS   925.1227  68.141295       0    0     0  1569.899
##  3:        3     AUS  1204.7060  51.732099       0    0     0  1894.738
##  4:        4     AUS 72425.8720 297.643163       0    0     0 98315.120
##  5:        5     AUS 58384.5017 631.807491       0    0     0 24247.396
##  6:        6     AUS  2605.2096  66.295431       0    0     0  2335.771
##  7:        7     AUS  5553.2449  36.009904       0    0     0  3382.998
##  8:        8     AUS  5385.5547   8.013658       0    0     0  2463.259
##  9:        9     AUS  4671.9083  16.344716       0    0     0  3042.421
## 10:       10     AUS 19239.5228 583.397244       0    0     0  3751.804
## 11:       11     AUS 10468.8475 120.183333       0    0     0  5333.571
## 12:       12     AUS  6321.7307   9.907760       0    0     0  3360.232
## 13:       13     AUS  6324.4477  35.313103       0    0     0  4251.740
## 14:       14     AUS  9868.0654  62.421243       0    0     0  5473.808
## 15:       15     AUS 37334.7513 269.529392       0    0     0  4766.247
## 16:       16     AUS 15617.1068  97.862928       0    0     0 10112.567
## 17:       17     AUS  1926.6851   4.258397       0    0     0  4016.797
## 18:       18     AUS  3637.6790  12.062149       0    0     0  2265.652
## 19:       19     AUS  8586.6225  30.389956       0    0     0  4993.905
## 20:       20     AUS 10571.1618  70.617707       0    0     0  3167.178
##         IntTTM         GO
##  1:  261.67872  70292.034
##  2:   22.21674   2585.380
##  3:   23.86781   3175.044
##  4:  946.48747 171985.122
##  5:  240.33129  83504.037
##  6:   56.74823   5064.024
##  7:   27.52288   8999.775
##  8:   68.40201   7925.230
##  9:   53.40740   7784.082
## 10:  880.29705  24455.020
## 11:  238.38261  16160.985
## 12:   92.86966   9784.740
## 13:  173.11259  10784.614
## 14:  133.81736  15538.112
## 15: 1214.13677  43584.665
## 16:  274.81625  26102.353
## 17:   85.19562   6032.936
## 18:   98.41985   6013.813
## 19:  210.65336  13821.571
## 20:  278.51840  14087.476
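As a quick plausibility check on these columns, the first three rows printed above can be re-entered by hand. In those rows the component columns add up to GO within rounding, which suggests (as an assumption to be confirmed in the documentation of aggregates) that GO is the sum of II_fob, TXSP, EXP_adj, PURR, PURNR, VA and IntTTM. The sketch also derives a value-added share; the names `component_sum` and `va_share` are made up for illustration.

```r
# First three rows of AGGREGATES, copied from the printed output above
agg <- data.frame(
  SectorNr = 1:3,
  Country  = "AUS",
  II_fob   = c(39039.2389, 925.1227, 1204.7060),
  TXSP     = c(501.926754, 68.141295, 51.732099),
  EXP_adj  = 0, PURR = 0, PURNR = 0,
  VA       = c(30489.190, 1569.899, 1894.738),
  IntTTM   = c(261.67872, 22.21674, 23.86781),
  GO       = c(70292.034, 2585.380, 3175.044)
)

# Sum of the component columns, to be compared with GO
components <- c("II_fob", "TXSP", "EXP_adj", "PURR", "PURNR", "VA", "IntTTM")
agg$component_sum <- rowSums(agg[, components])

# Share of value added in output, per country-sector
agg$va_share <- agg$VA / agg$GO

# Differences stay below the rounding of the printed values
max(abs(agg$component_sum - agg$GO))
```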
