On the EVO DWH project, my task was to extend the already established ETL process and add new functionality. The project was specific in that, unlike on other projects, we could not connect to the environment through remote access, so every change or optimization had to be deployed in person, directly at the client's site. One therefore had to arrive prepared, with the deployment scripts on a USB key, and in case of problems react flexibly to the situation and resolve it calmly. There were three of us on the project, so we could help each other when change requests (CRs) came in.

The ETL process ran on an Oracle database through application jobs, and an internal application developed by our company served as the BI front end. As on the ŽSR DWH project, the database was designed around three schemas: STG (staging), TRF (transformation), and DWH (data warehouse). Data from external sources was loaded into staging; all adjustments to the data were made in the transformation layer; and the delta data was then loaded into the DWH (core) layer, from which it was displayed in the resulting application. Deleted records were cancelled transactionally via a so-called soft delete. Data was merged into the TRF layer in PL/SQL, in batches within defined limits, and was then pulled into the DWH by an application job once a day, after which it could appear in the resulting application.

The processed data included, for example, current and historical vehicle registration numbers, data on vehicle owners, and all technical details of the vehicles. Each of these areas formed one data mart, to which the necessary dimension tables were attached in a star schema. The scripts were optimized using indexes and parallelism through query hints.
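A soft delete of the kind described above can be sketched roughly as follows; the table and column names (trf.vehicle, deleted_flag, valid_to) are illustrative, not the project's actual schema:

```sql
-- Soft delete: the row is never physically removed, only marked as
-- cancelled inside the load transaction, so history stays queryable.
UPDATE trf.vehicle
   SET deleted_flag = 'Y',
       valid_to     = SYSDATE
 WHERE vehicle_id   = :p_vehicle_id
   AND deleted_flag = 'N';

COMMIT;
```

The downstream layers then simply filter on `deleted_flag = 'N'` instead of relying on rows physically disappearing.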
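The batched PL/SQL merge with query hints could look roughly like this; all object names and the batch boundaries are hypothetical, and the PARALLEL hint stands in for the kind of hints actually used:

```sql
-- Illustrative MERGE from staging into the transformation layer.
-- The WHERE clause restricts one run to a bounded key range (a batch
-- "within certain limits"), and the PARALLEL hint asks Oracle to
-- distribute the work across several server processes.
MERGE /*+ PARALLEL(tgt, 4) */ INTO trf.vehicle tgt
USING (SELECT vehicle_id, owner_id, reg_number
         FROM stg.vehicle_src
        WHERE vehicle_id BETWEEN :p_batch_lo AND :p_batch_hi) src
   ON (tgt.vehicle_id = src.vehicle_id)
 WHEN MATCHED THEN
   UPDATE SET tgt.owner_id   = src.owner_id,
              tgt.reg_number = src.reg_number
 WHEN NOT MATCHED THEN
   INSERT (vehicle_id, owner_id, reg_number)
   VALUES (src.vehicle_id, src.owner_id, src.reg_number);
```

Committing after each batch keeps undo usage bounded, which matters when the daily job has to finish within a fixed window.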
Because a large number of records had accumulated and the runtime of the daily application job had to be reduced, a lot of time was spent exchanging emails with the client, together with the scripts the client had to run as part of the resulting processes. For that reason, I would rather choose a different ETL product in the future, such as Oracle Data Integrator (ODI) or an open-source tool like Druid or Pentaho.
The project showed that a classic RDBMS approach to ETL in Oracle is effective for tables with a smaller number of records. For more complex queries, I would rather choose another alternative, such as Oracle ODI, an in-memory tablespace, or bulk insert processing. I would also reconsider the use of MERGE and replace it with a UNION-based approach.
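One possible reading of replacing MERGE with a set operation is to compute the delta with MINUS and append it using a direct-path bulk insert; the object names below are again illustrative:

```sql
-- Instead of MERGE, select only the rows that are new or changed and
-- append them with a direct-path insert (the APPEND hint), which also
-- illustrates the bulk insert processing mentioned above.
INSERT /*+ APPEND */ INTO dwh.vehicle (vehicle_id, owner_id, reg_number)
SELECT vehicle_id, owner_id, reg_number FROM trf.vehicle
MINUS
SELECT vehicle_id, owner_id, reg_number FROM dwh.vehicle;

COMMIT;
```

Note that this only appends new or changed rows; updates to already-existing keys would still need separate handling, which is part of why such a change is worth measuring rather than assuming.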