activewarehouse / activewarehouse-etl

Extract-Transform-Load library from ActiveWarehouse

Documentation for removal of column during destination output

OpenCoderX opened this issue

I would like to import three columns from the source and make transformations based on all of them, but only output two columns to the destination.

How can I do that?

Chris

Hi Chris!

First, apologies: this gem really needs more simple examples like the one you describe :) This is definitely planned.

For future questions, could you ask them on the Google group if possible? Thanks!

https://groups.google.com/forum/?fromgroups#!topic/activewarehouse-discuss

Here's a rough example of what you could do for sources:

# declare a csv source with automatic fields naming based on first line
source :my_source, :file => 'data_input.csv', :skip_lines => 1, :parser => :csv

# or alternatively, specify the column names yourself
source :my_source, { :file => 'data_input.csv', :skip_lines => 1, :parser => :csv }, [ 'id', 'first_name', 'last_name' ]

You can then declare transformations at the row level or the field level (see https://github.com/activewarehouse/activewarehouse-etl/wiki/Documentation for more information):

# field-level transform
transform(:full_name) do |name, value, row|
  [row[:first_name], row[:last_name]].join(' ')
end

# row-level transform
before_write do |row|
  row[:full_name] = [row[:first_name], row[:last_name]].join(' ')
  row # must be returned for row-level transforms
end

And for the destination (the output columns are specified with :order, which is a bit confusing at first; any column you leave out of :order is not written):

destination :out, { :file => 'output_file.csv' }, { :order => ['full_name'] }
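For the original question (three columns in, two columns out), the idea is to list only the two columns you want in :order. Below is a minimal plain-Ruby sketch of that idea. It does not use activewarehouse-etl itself: the helpers add_full_name and project are hypothetical names made up for illustration, the sample rows are invented, and the control-file line in the comment is an untested guess based on the example above.

```ruby
# Plain-Ruby illustration only; this does NOT call activewarehouse-etl.
# In a control file, the equivalent destination would presumably be:
#   destination :out, { :file => 'output_file.csv' }, { :order => ['id', 'full_name'] }

# Hypothetical helper: derive full_name from two of the source columns.
def add_full_name(row)
  row.merge(:full_name => [row[:first_name], row[:last_name]].join(' '))
end

# Hypothetical helper: keep only the columns named in `order`, in that order.
def project(row, order)
  order.map { |column| row[column.to_sym] }
end

# Invented sample data standing in for the three source columns.
source_rows = [
  { :id => '1', :first_name => 'Ada',   :last_name => 'Lovelace' },
  { :id => '2', :first_name => 'Grace', :last_name => 'Hopper' }
]

# Only two columns are listed, so first_name and last_name are dropped.
order = ['id', 'full_name']
output_rows = source_rows.map { |row| project(add_full_name(row), order) }

output_rows.each { |values| puts values.join(',') }
```

The transformation still sees all three source columns; only the final projection decides what reaches the output.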

Hope this helps!

Let me know via the Google group if you run into any issues.

Added to milestone 1.0.0 so that I don't forget to provide better "getting started" stuff.