On the ŽSR DWH project, my task was to build the ETL process from the staging layer to the core layer. Oracle was chosen as the backend database because it was already used in other company projects; the frontend/BI layer had not yet been developed at that stage.

At the beginning I created database links to all of ŽSR's source database systems — there were about eight, all running on Oracle. The data was then pulled into the staging layer via materialized views. No further transformations were made in this layer; only refresh procedures were scheduled according to how the materialized view logs (MLOGs) filled up. All other transformations took place in the so-called transformation layer, where the core tables were duplicated: these copies were truncated during processing and captured the changed data that then went on to the core layer.

As for the internal logic, I wrote the PL/SQL code according to the analyst's specification. The sequence diagram was written quite complexly, so without direct communication with the analyst it could hardly be understood. The processed data covered, for example, parked vehicles, delayed trains, the work efficiency of train drivers, and lists of stations, station renovations, and replacement connections. Each of these areas formed one data mart, to which the necessary dimension tables were joined in a star schema. The scripts were optimized with indexes and parallel hints, which was not always sufficient for the chosen processing.
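The staging setup described above — a database link to a source system, a materialized view log on the source table, and a fast-refreshable materialized view in staging — can be sketched roughly as follows. All object names, credentials, and the TNS alias are hypothetical; the real ŽSR objects differed.

```sql
-- Hypothetical DB link to one of the ~8 Oracle source systems.
CREATE DATABASE LINK src_trains_link
  CONNECT TO stg_reader IDENTIFIED BY "********"
  USING 'SRC_TRAINS';  -- TNS alias of the source database

-- On the source side: a materialized view log, so that fast
-- (incremental) refresh can pick up only the changed rows.
CREATE MATERIALIZED VIEW LOG ON train_delay
  WITH PRIMARY KEY
  INCLUDING NEW VALUES;

-- In the staging layer: a materialized view over the DB link,
-- refreshed on demand by the scheduled refresh procedures.
CREATE MATERIALIZED VIEW stg_train_delay
  REFRESH FAST ON DEMAND
  AS SELECT * FROM train_delay@src_trains_link;

-- The call a scheduled refresh procedure would issue:
BEGIN
  DBMS_MVIEW.REFRESH('STG_TRAIN_DELAY', method => 'F');  -- 'F' = fast refresh
END;
/
```

Fast refresh over a database link requires the MLOG to exist on the master (source) site, which matches the described setup of refreshing staging according to how the MLOGs fill up.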
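The optimization techniques mentioned — indexes and parallel hints — might look like this in outline; table and column names are illustrative, not the project's real ones.

```sql
-- Index on a join/filter column of a transformation-layer table.
CREATE INDEX ix_tl_train_delay_station ON tl_train_delay (station_id);

-- Parallel hints on a heavy load statement: APPEND requests a
-- direct-path insert, PARALLEL spreads the work across 4 slaves.
INSERT /*+ APPEND PARALLEL(t, 4) */ INTO core_train_delay t
SELECT /*+ PARALLEL(s, 4) */ *
FROM   tl_train_delay s;
COMMIT;
```

As noted, hints of this kind help but are not always sufficient; direct-path parallel inserts, for instance, lock the target table and invalidate its cursor-level caching, so they suit batch ETL windows rather than concurrent workloads.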
The project showed that the Oracle RDBMS is also suitable for running ETL processes over small and medium-sized tables. For more complex workloads I would rather choose Oracle ODI, the Oracle Database In-Memory column store, or try bulk-insert processing.
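The bulk-insert processing mentioned as an alternative can be sketched with PL/SQL's BULK COLLECT / FORALL pattern, which batches fetches and DML instead of row-by-row loops. Table names are hypothetical.

```sql
-- Sketch of batched loading from a transformation-layer table
-- into the core layer; names are illustrative only.
DECLARE
  TYPE t_delay_tab IS TABLE OF tl_train_delay%ROWTYPE;
  l_rows t_delay_tab;
  CURSOR c_src IS
    SELECT * FROM tl_train_delay;   -- changed-data (delta) table
BEGIN
  OPEN c_src;
  LOOP
    FETCH c_src BULK COLLECT INTO l_rows LIMIT 10000;  -- batch fetch
    EXIT WHEN l_rows.COUNT = 0;
    FORALL i IN 1 .. l_rows.COUNT                      -- one bulk DML call
      INSERT INTO core_train_delay VALUES l_rows(i);
    COMMIT;                                            -- commit per batch
  END LOOP;
  CLOSE c_src;
END;
/
```

The LIMIT clause keeps PGA memory bounded, and FORALL reduces the context switches between the PL/SQL and SQL engines that make row-by-row inserts slow.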