Very draft version :)

Ruby version: 1.9
Gems required: sqlite3, nokogiri, htmlentities (open-uri is also used; it ships with Ruby's standard library)

Basic Rake workflow (run rake -T for the full list of available tasks)

# Create the cache folders and the SQLite DB
rake setup:init

# OPTIONAL step: by default, timetables for the whole of Switzerland are fetched.
# To restrict the fetch to a smaller area, run the following command.
# The 'bounds' parameter takes 4 comma-separated values:
# SW corner longitude,SW corner latitude,NE corner longitude,NE corner latitude
rake station:set_bounds bounds=8.53,47.41,8.61,47.46

# Discover the stations in the studied area
rake station:fetch

# OPTIONAL step: export the stations to tmp/station.csv for visualizing the data (e.g. in QGIS)
rake station:export

# OPTIONAL step (although recommended): remove the stations outside Switzerland's borders
rake station:geo_clean

# Fetch from SBB the files containing the departures for each station
rake departure:fetch

# Remove the files that contain errors. If any files are reported, run the previous step again
rake departure:files_clean

# Insert the timetables into the DB
rake timetable:parse

# Remove timetable duplicates (based on departure, vehicle_id, destination)
rake timetable:remove_duplicates

# Remove the stops at stations that are not known (outside Switzerland)
rake timetable:remove_notknown_stations

# Determine the station type from the timetables
rake station:parse_type

# OPTIONAL step: export the stations to tmp/station.csv for visualizing the data (e.g. in QGIS)
rake station:export

# Build the vehicle table based on the departures
rake vehicle:insert

# Update the vehicle table with info for each vehicle and insert the arrivals at destination into the timetables
rake vehicle:build

# Remove again the stops at stations that are not known (outside Switzerland), which the previous step may have inserted
rake timetable:remove_notknown_stations

# Remove vehicles that have only one station in Switzerland and the rest outside (like TGVs leaving for France from the border)
rake vehicle:remove_onestopper

# Run this step again: some stations may now have no vehicle stops left, and we don't want to display those stations
rake station:parse_type

# Remove the stops of vehicles that have duplicate consecutive stations with different departure times.
# Similar to the 'timetable:remove_duplicates' task
rake vehicle:check_duplicate_stations
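
To make the 'bounds' format used by station:set_bounds concrete, here is a minimal sketch of how such a value could be parsed and used to test whether a station falls inside the area. The helper names parse_bounds and in_bounds? are hypothetical and are not taken from this project's Rakefile; the actual task may handle the value differently.

```ruby
# Hypothetical helper (not from this project): parse a 'bounds' string of the form
# "SW longitude,SW latitude,NE longitude,NE latitude" into named corners.
def parse_bounds(value)
  # Float() raises ArgumentError on non-numeric input, unlike String#to_f.
  parts = value.to_s.split(',').map { |part| Float(part.strip) }
  unless parts.size == 4
    raise ArgumentError, "expected 4 comma-separated values, got #{parts.size}"
  end

  sw_lon, sw_lat, ne_lon, ne_lat = parts
  unless sw_lon < ne_lon && sw_lat < ne_lat
    raise ArgumentError, 'the SW corner must lie south-west of the NE corner'
  end

  { :sw_lon => sw_lon, :sw_lat => sw_lat, :ne_lon => ne_lon, :ne_lat => ne_lat }
end

# Hypothetical helper: true when the point (lon, lat) lies inside the bounding box.
def in_bounds?(bounds, lon, lat)
  lon >= bounds[:sw_lon] && lon <= bounds[:ne_lon] &&
    lat >= bounds[:sw_lat] && lat <= bounds[:ne_lat]
end

# Example with the bounds value shown in the workflow above.
bounds = parse_bounds('8.53,47.41,8.61,47.46')
puts in_bounds?(bounds, 8.54, 47.42)
```

Inside a Rake task, the value would typically be read from ENV['bounds'], since Rake exposes name=value command-line arguments through ENV.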